When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.


When I'm sure the value will never need to be negative, I then use an unsigned integer.
And I have to say this happens most of the time.


When reading other people's code, I rarely see unsigned integers, even if the represented value can't be negative.


So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?


I've searched on the subject, here and in other places, and I have to say I can't find a good reason not to use unsigned integers where they apply.


I came across these questions: «Default int type: Signed or Unsigned?», and «Should you always use 'int' for numbers in C, even if they are non-negative?», which both present the following example:


for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}

To me, this is just bad design. Of course, it may result in an infinite loop, with unsigned integers.
But is it so hard to check if foo.Length() is 0, before the loop?


So I personally don't think this is a good reason for using signed integers all the way.
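For what it's worth, a zero-length check alone would not cure that loop, because `i >= 0` holds for every unsigned value. Here is a minimal sketch of a down-counting loop that stays unsigned. The function name, its parameters and the use of a plain array instead of `foo.Length()` are assumptions made up for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: walks an array from the last element to the first
   using an unsigned index. The test `i-- > 0` compares before decrementing,
   so the body sees n-1, n-2, ..., 0 and the loop ends without ever needing
   a negative value. It also does nothing when n == 0. */
long digits_backwards(const int *values, size_t n)
{
    long result = 0;
    for (size_t i = n; i-- > 0; ) {
        result = result * 10 + values[i];
    }
    return result;
}
```

So the loop can be written safely with an unsigned index; it just needs a different condition than the signed version.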


Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.


Ok, that's good to have a specific value that means «error».
But then, what's wrong with something like UINT_MAX, for that specific value?
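To make the idea concrete, here is a small sketch of what I mean. `INDEX_NOT_FOUND` and `find_first` are names invented for this example, not part of any library:

```c
#include <limits.h>

/* Hypothetical sentinel: UINT_MAX plays the role that -1 plays in signed APIs. */
#define INDEX_NOT_FOUND UINT_MAX

/* Returns the index of the first occurrence of `wanted`, or INDEX_NOT_FOUND.
   The price is that UINT_MAX can never be used as a valid index. */
unsigned int find_first(const int *values, unsigned int count, int wanted)
{
    for (unsigned int i = 0; i < count; ++i) {
        if (values[i] == wanted) {
            return i;
        }
    }
    return INDEX_NOT_FOUND;
}
```

The caller compares against `INDEX_NOT_FOUND` exactly as it would compare against -1 with a signed return type.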


I'm actually asking this question because it may lead to some huge problems, usually when using third-party libraries.


In such a case, you often have to deal with signed and unsigned values.


Most of the time, people just don't care about the signedness, and assign, for instance, an unsigned int to a signed int without checking the range.


I have to say I'm a bit paranoid with the compiler warning flags, so with my setup, such an implicit conversion will result in a compiler error.


For that kind of stuff, I usually use a function or macro to check the range, and then assign using an explicit cast, raising an error if needed.
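Roughly, such a check could look like the sketch below. The function names are made up for this post, and reporting the failure through a `bool` is just one possible choice:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: converts unsigned to int only when the value fits.
   Returns false and leaves *out untouched when it does not. */
bool uint_to_int_checked(unsigned int in, int *out)
{
    if (in > (unsigned int)INT_MAX) {
        return false;
    }
    *out = (int)in;
    return true;
}

/* The opposite direction: a negative int has no unsigned counterpart. */
bool int_to_uint_checked(int in, unsigned int *out)
{
    if (in < 0) {
        return false;
    }
    *out = (unsigned int)in;
    return true;
}
```

The explicit casts sit behind the range test, so they can no longer change the value.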


This just seems logical to me.


As a last example, as I'm also an Objective-C developer (note that this question is not related to Objective-C only):


- ( NSInteger )tableView: ( UITableView * )tableView numberOfRowsInSection: ( NSInteger )section;

For those not fluent with Objective-C, NSInteger is a signed integer.
This method actually retrieves the number of rows in a table view, for a specific section.


The result will never be a negative value (like the section number, by the way).


So why use a signed integer for this?
I really don't understand.


This is just an example, but I just always see that kind of stuff, with C, C++ or Objective-C.


So again, I'm just wondering if people simply don't care about that kind of problem, or if there is actually a good and valid reason not to use unsigned integers in such cases.


Looking forward to hearing your answers : )


5 answers



  • a signed return value might yield more information (think error-numbers, 0 is sometimes a valid answer, -1 indicates error, see man read) ... which might be relevant especially for developers of libraries.


  • if you are worrying about the one extra bit you gain when using unsigned instead of signed then you are probably using the wrong type anyway. (also kind of "premature optimization" argument)


  • languages like Python, Ruby, JScript etc. are doing just fine without signed vs. unsigned. That might be an indicator ...




There is one heavy-weight argument against using unsigned integers widely:


Premature optimization is the root of all evil.


We all have at least on one occasion been bitten by unsigned integers. Sometimes like in your loop, sometimes in other contexts. Unsigned integers add a hazard, even though a small one, to your program. And you are introducing this hazard to change the meaning of one bit. One little, tiny, insignificant-but-for-its-sign-meaning bit. On the other hand, the integers we work with in bread and butter applications are often far below the range of integers, more on the order of 10^1 than 10^7. Thus, the different range of unsigned integers is in the vast majority of cases not needed. And when it's needed, it is quite likely that this extra bit won't cut it (when 31 is too little, 32 is rarely enough) and you'll need a wider or an arbitrary-width integer anyway. The pragmatic approach in these cases is to just use the signed integer and spare yourself the occasional underflow bug. Your time as a programmer can be put to much better use.



When using integer values in my own code, I always try to consider the signedness, asking myself if the integer should be signed or unsigned.


When I'm sure the value will never need to be negative, I then use an unsigned integer. And I have to say this happens most of the time.


Carefully considering which type is most suitable each time you declare a variable is very good practice! This means you are careful and professional. You should not only consider signedness, but also the potential max value that you expect this type to have.


The reason why you shouldn't use signed types when they aren't needed has nothing to do with performance, but with type safety. There are lots of potential, subtle bugs that can be caused by signed types:


  • The various forms of implicit promotion that exist in C can cause your type to change signedness in unexpected and possibly dangerous ways. Examples are the integer promotion rule that is part of the usual arithmetic conversions, the lvalue conversion upon assignment, the default argument promotions used by, for example, variable argument lists, and so on.


  • When using any form of bitwise operators or similar hardware-related programming, signed types are dangerous and can easily cause various forms of undefined behavior.


By declaring your integers unsigned, you automatically skip past a whole lot of the above dangers. Similarly, by declaring them as large as unsigned int or larger, you get rid of lots of dangers caused by the integer promotions.
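A small sketch of the kind of surprise meant here. The function names are invented for illustration, and the values in the comments assume the usual two's complement representation:

```c
#include <stdint.h>

/* The operand is uint8_t, yet `~` is applied after integer promotion,
   so the result is a (negative) int with all the upper bits set. */
int invert_promoted(uint8_t x)
{
    return ~x;               /* for x == 0xFF this is -256, not 0 */
}

/* Casting back to the narrow unsigned type gives the value most people expect. */
uint8_t invert_narrow(uint8_t x)
{
    return (uint8_t)~x;      /* for x == 0xFF this is 0 */
}

/* Shifting in unsigned arithmetic is well defined; 1 << 31 with a 32-bit
   signed int would shift into the sign bit instead. */
uint32_t top_bit_mask(void)
{
    return UINT32_C(1) << 31;
}
```

Small unsigned types still get promoted to int, which is why declaring variables at least as large as unsigned int avoids this particular trap.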


Both size and signedness are important when it comes to writing rugged, portable and safe code. This is the reason why you should always use the types from stdint.h and not the native, so-called "primitive data types" of C.


So I asked myself: «is there a good reason for this, or do people just use signed integers because they don't care»?


I don't really think it is because they don't care, nor because they are lazy, even though declaring everything int is sometimes referred to as "sloppy typing" - which means sloppily picked type more than it means too lazy to type.


I rather believe it is because they lack deeper knowledge of the various things I mentioned above. There's a frightening number of seasoned C programmers who don't know how implicit type promotions work in C, nor how signed types can cause poorly-defined behavior when used together with certain operators.


This is actually a very frequent source of subtle bugs. Many programmers find themselves staring at a compiler warning or a peculiar bug, which they can make go away by adding a cast. But they don't understand why; they simply add the cast and move on.


for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}


To me, this is just bad design


Indeed it is.


Once upon a time, down-counting loops would yield more efficient code, because the compiler could pick a "branch if zero" instruction instead of a "branch if larger/smaller/equal" instruction; the former is faster. But this was at a time when compilers were really dumb, and I don't believe such micro-optimizations are relevant any longer.

So there is rarely ever a reason to have a down-counting loop. Whoever made the argument probably just couldn't think outside the box. The example could have been rewritten as:


for(unsigned int i=0; i<foo.Length(); i++) {}

This code should not have any impact on performance, but the loop itself became a whole lot easier to read, while at the same time fixing the bug that your example had.
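If the body really needs to visit the elements back to front, the reverse index can be derived inside an up-counting loop. A minimal sketch, assuming a plain array and `size_t` rather than `foo.Length()`; the function name is hypothetical:

```c
#include <stddef.h>

/* Hypothetical example: copies src into dst in reverse order.
   The loop counts up, and the reverse index is computed in the body.
   Because i < n holds inside the loop, n - 1 - i never wraps around. */
void copy_reversed(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[n - 1 - i];
    }
}
```

An empty array simply skips the loop, so no separate zero check is needed.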


As far as performance is concerned nowadays, one should probably spend the time pondering which form of data access is best in terms of data cache use, rather than anything else.


Some people may also say that signed integers may be useful, even for non-negative values, to provide an error flag, usually -1.


That's a poor argument. Good API design uses a dedicated error type for error reporting, such as an enum.


Instead of having some hobbyist-level API like


int do_stuff (int a, int b); // returns -1 if a or b were invalid, otherwise the result

you should have something like:


err_t do_stuff (int32_t a, int32_t b, int32_t* result);

// returns ERR_A if a is invalid, ERR_B if b is invalid, ERR_XXX if... and so on
// the result is stored in [result], which is allocated by the caller
// upon errors the contents of [result] remain untouched

The API would then consistently reserve the return of every function for this error type.
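As a rough sketch of how such a function might look when filled in: only `err_t`, `do_stuff`, `ERR_A` and `ERR_B` come from the declaration above. The other enumerators, the rule that negative inputs are invalid, and the subtraction are assumptions picked purely for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error type for the example. */
typedef enum { ERR_OK = 0, ERR_A, ERR_B, ERR_NULL } err_t;

/* Assumed contract: a and b must be non-negative; the result is a - b.
   On any error the contents of *result remain untouched. */
err_t do_stuff(int32_t a, int32_t b, int32_t *result)
{
    if (result == NULL) {
        return ERR_NULL;
    }
    if (a < 0) {
        return ERR_A;
    }
    if (b < 0) {
        return ERR_B;
    }
    *result = a - b;  /* may legitimately be -1, which an in-band flag could not express */
    return ERR_OK;
}
```

With the error channel separated from the data, every value of the result type stays available as a real result.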


(And yes, many of the standard library functions abuse return types for error handling. This is because it contains lots of ancient functions from a time before good programming practice was invented, and they have been preserved the way they are for backwards-compatibility reasons. So just because you find a poorly-written function in the standard library, you shouldn't run off to write an equally poor function yourself.)


Overall, it sounds like you know what you are doing and are giving signedness some thought. That probably means that knowledge-wise, you are actually already ahead of the people who wrote those posts and guides you are referring to.


The Google style guide, for example, is questionable. The same could be said about lots of other such coding standards that use "proof by authority". Just because they say Google, NASA or Linux kernel, people blindly swallow them no matter the quality of the actual contents. There are good things in those standards, but they also contain subjective opinions, speculations or blatant errors.


Instead, I would recommend referring to real professional coding standards, such as MISRA-C. It enforces lots of thought and care for things like signedness, type promotion and type size, where less detailed/less serious documents just skip past them.


There is also CERT C, which isn't as detailed and careful as MISRA, but is at least a sound, professional document (and more focused on desktop/hosted development).



From the C FAQ:


The first question in the C FAQ asks which integer type we should decide to use:

If you might need large values (above 32,767 or below -32,767), use long. Otherwise, if space is very important (i.e. if there are large arrays or many structures), use short. Otherwise, use int. If well-defined overflow characteristics are important and negative values are not, or if you want to steer clear of sign-extension problems when manipulating bits or bytes, use one of the corresponding unsigned types.
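To illustrate what "well-defined overflow characteristics" buys you, here is a tiny sketch. The function names are mine, not the FAQ's:

```c
#include <limits.h>

/* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. */
unsigned int next_sequence_number(unsigned int current)
{
    return current + 1u;     /* UINT_MAX + 1u wraps to 0 */
}

/* The difference of two readings of a free-running counter stays correct
   across a single wrap, because the subtraction wraps the same way. */
unsigned int elapsed_ticks(unsigned int start, unsigned int now)
{
    return now - start;
}
```

The same expressions with int operands would overflow, which is undefined behavior for signed types.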


Another question concerns type conversions:


If an operation involves both signed and unsigned integers, the situation is a bit more complicated. If the unsigned operand is smaller (perhaps we're operating on unsigned int and long int), such that the larger, signed type could represent all values of the smaller, unsigned type, then the unsigned value is converted to the larger, signed type, and the result has the larger, signed type. Otherwise (that is, if the signed type can not represent all values of the unsigned type), both values are converted to a common unsigned type, and the result has that unsigned type.


You can find it here. So basically, using unsigned integers can complicate things, mostly because of the arithmetic conversions: you either have to make all your integers unsigned, or risk confusing the compiler and yourself. As long as you know what you are doing, this is not really a risk per se, but it can introduce simple bugs.
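A short sketch of the two cases the FAQ describes. The function names are invented, and the second function assumes long long is wider than unsigned int, which holds on common platforms:

```c
/* int vs unsigned int: the int operand is converted to unsigned int,
   so -1 becomes UINT_MAX before the comparison is made. */
int less_int_uint(int a, unsigned int b)
{
    return a < b;
}

/* long long vs unsigned int: the signed type can represent every
   unsigned int value here, so the unsigned operand is converted instead. */
int less_llong_uint(long long a, unsigned int b)
{
    return a < b;
}

/* One way to compare by value: handle the negative case first. */
int less_by_value(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

Calling `less_int_uint(-1, 1u)` yields 0, while `less_llong_uint(-1, 1u)` and `less_by_value(-1, 1u)` both yield 1. Many compilers can warn about the first form when sign-comparison warnings are enabled.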


And when is it good to use unsigned integers? One situation is when using bitwise operations:


The << operator shifts its first operand left by a number of bits given by its second operand, filling in new 0 bits at the right. Similarly, the >> operator shifts its first operand right. If the first operand is unsigned, >> fills in 0 bits from the left, but if the first operand is signed, >> might fill in 1 bits if the high-order bit was already 1. (Uncertainty like this is one reason why it's usually a good idea to use all unsigned operands when working with the bitwise operators.)
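A minimal sketch of keeping shifts unsigned, with function names made up for the example:

```c
#include <stdint.h>

/* Extracts byte `index` (0 = least significant, at most 3) from a 32-bit word.
   With an unsigned operand, >> always shifts in zero bits. */
uint8_t byte_at(uint32_t word, unsigned int index)
{
    return (uint8_t)((word >> (8u * index)) & 0xFFu);
}

/* A signed value is converted to unsigned before shifting, so the result
   does not depend on how the implementation shifts negative numbers. */
uint32_t top_nibble(int32_t value)
{
    return (uint32_t)value >> 28;
}
```

Shifting the int32_t directly would leave it up to the implementation whether the vacated bits are filled with 0 or 1 for negative values.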

Taken from here. And I've seen this somewhere:


If it was best to use unsigned integers for values that are never negative, we would have started by using unsigned int in the main function int main(int argc, char* argv[]). One thing is sure, argc is never negative.



As mentioned in the comments, the signature of main is due to historical reasons and apparently it predates the existence of the unsigned keyword.




Unsigned integers are an artifact from the past. They date from a time when processors could do unsigned arithmetic a little bit faster.


This is a case of premature optimization which is considered evil.


Actually, in 2003 when AMD introduced x86_64 (or AMD64, as it was then called), the 64-bit architecture for x86, they brought the ghosts of the past back: if a signed integer is used as an index and the compiler cannot prove that it is never negative, it has to insert a 32-to-64-bit sign extension instruction, because the default 32-to-64-bit extension is unsigned (the upper half of a 64-bit register gets cleared if you move a 32-bit value into it).

But I would recommend against using unsigned in any arithmetic at all, be it pointer arithmetic or just simple numbers.


for( unsigned int i = foo.Length() - 1; i >= 0; --i ) {}


Any recent compiler will warn about such a construct, with "condition is always true" or similar. By using a signed variable you avoid such pitfalls altogether. Use ptrdiff_t instead.
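A minimal sketch of the signed version, assuming a plain array whose length fits in ptrdiff_t; the function name and the -1 "not found" result are illustrative choices:

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the last occurrence of `wanted`,
   or -1 if there is none. With a signed index, `i >= 0` is a real test and
   the loop ends once i reaches -1. Assumes n <= PTRDIFF_MAX. */
ptrdiff_t find_last(const int *values, size_t n, int wanted)
{
    for (ptrdiff_t i = (ptrdiff_t)n - 1; i >= 0; --i) {
        if (values[i] == wanted) {
            return i;
        }
    }
    return -1;
}
```

With n == 0 the index starts at -1 and the loop body never runs.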


A problem might be the C++ library: it uses the unsigned type size_t for sizes, which is needed because of some rare corner cases with very large sizes (between 2^31 and 2^32) on 32-bit systems with certain boot switches (/3GB on Windows).

There are many more pitfalls. Comparisons between signed and unsigned come to mind, where the signed value automagically gets converted to unsigned and thus becomes a huge positive number when it was a small negative one before.


One exception for using unsigned exists: For bit fields, flags, masks it is quite common. Usually it doesn't make sense at all to interpret the value of these variables as a magnitude, and the reader may deduce from the type that this variable is to be interpreted in bits.


The result will never be a negative value (as the section number, by the way). So why use a signed integer for this?


Because you might want to compare the return value to a signed value that is actually negative. The comparison should return true in that case, but the C standard specifies that the signed value gets converted to unsigned in that case, and you will get false instead. I don't know about Objective-C though.


