Binary Encoding of Integer
source link: https://dannypsnl.github.io/blog/2020/03/21/cs/binary-encoding-of-interger/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Binary Encoding of Integer
Z\mathbb{Z}Z contains positive and negative numbers, but nowdays Computer system based on binary. There only have 0
and 1
can be used. A simple solution is: put signed symbol at the most significant bit. For example, use 8 bits can represent:
This is an intuitive way. Therefore, early days computers applied it.
But this way(called Sign magnitude) has two big problems:
- waste a place: +0 and -0 are different in this system
- have to detect signed or not and use different ways to do an operation like addition
One's complement
To solve the second problem with the calculation complex, engineers introduce one's complement. For example, a number: −4310-43_{10}−4310 is 1001010112100101011_21001010112, the one's complement of it is 1110101002111010100_21110101002(keep the most significant bit and reverse the rest of bits). It still only works for −12710...12710-127_{10} ... 127_{10}−12710...12710 with 8 bits, didn't solve the first problem obviously, why we say it solves the second one? Because we can replace special digital electronics with two subtractions:
Then we convert negative number to one's complement, and remember end-around carry(the bit over most significant bit should be add back), then get:
Or 1110+00101110 + 00101110+0010 would get 1[0000]→00011[0000] \to 00011[0000]→0001
Benefit
- no need to check signed bit
- reverse bits can use addition on subtraction
Problem
- need a special unit for end-around carry
- there still have two zero representations
Two's complement
Modular Arithmetic
If we have an integer use 3 bits to represent, then 7+1≡0(mod8)7 + 1 \equiv 0(mod \; 8)7+1≡0(mod8) because of the overflow.
If we have k bits for an integer, then we can generalize to A+B≡C(mod2k)A + B \equiv C(mod \; 2^k)A+B≡C(mod2k).
Abelian group
For additive, we have to implement an Abelian group: an Identity element and every element has an Inverse element that adds them would get the Identity element.
Now consider how can our 3-bits integer fit rules of Abelian group? If [000]
represents 0
, it's identity in our group. Assuming [001]
is 1
, how to know the representation of -1
? It's easy to get [111]
this answer because we know [111] + [001] = 1[000]
, overflow would be dropped. And it cannot be any number except -1
, else we cannot find others inverse element of 1
. With this, we can keep finding an inverse for 2
([010]
) is [110]
and so on.
On the other hand, it still simple: -1
is the biggest negative number in 3-bits encoding. Follows unsigned order, [111] > [110]
, then don't need another way to compare numbers.
Properties
Now we can see we map all possible combinations of bits to an exact number. Solve the first problem in previous solutions.
Even more, the subtractive of signed integer is additive of unsigned number: 010 + 101
can be 2 + 5
or 2 + (-3)
. Therefore:
Now we know we don't need subtractive, it can be replaced with one's complement plus one.
Back to group, one's complement would waste place about encoding is a thing must happen, because in 3-bits encoding it maps [111]
and [000]
. In two's complement the axis of symmetry pass through [000]
, therefore, we won't get repetitive representations of zero.
Extend this [001]
maps to [111]
in two's complement, maps to [110]
in one's complement. Therefore, one's complement plus one would be two's complement.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK