What Does it Mean to Store Computer Data as Binary?
You may have been told that computers store all their data in 0s and 1s, a system known as binary. But did you know those 1s don't really mean "one"? When you translate a binary number into a number that makes sense to you, each 1 stands for 2 raised to some exponent.
The first thing you have to understand about binary data is that it’s just numbers. That’s it. A bunch of ordinary numbers. So for example, if you wanted to store the number 5 in a computer’s memory, it would be stored as 101. So why do this? Why not just simply store it as 5? I’ll explain more on why later but the thing you have to know is that computer memory works like a bunch of tiny electric switches. This is because a switch is the simplest way to reliably store any information.
Imagine walking into a room with the lights off. You can definitively say that you know something about this room. You can be positive that someone turned off the lights. Now imagine coming back to that room later and the lights are on. The lights being in a state of “on” tells you definitively that someone has been in that room and flipped the switch. So we can say that information has been stored. In computing terms, we can say that the switch stores a "state". The switch is either in a state of on, or off.
So, because computers have all these tiny switches that can be in a state of on or off, we can assign a numerical value to those states. Since we only have two states for each switch, we have to use a counting system that fits within those limitations. The binary system fits perfectly. The number system you're probably most familiar with is called decimal, or "base 10". Binary is "base 2". The base is just the number of symbols we have for writing numbers. Once we run out of symbols, we "wrap around" and start combining them to get to the next group of numbers.
Here's a visual explanation of base 10:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10 // we've run out of symbols to represent higher numbers so..
11, 12, 13, 14, 15, 16, 17, 18, 19, 20 // we 'wrap around' and start combining the symbols with 10
10+1, 10+2, 10+3, 10+4, 10+5,... // you can see how these numbers are really just additions to the base
Here's a visual explanation of base 2:
1, 2 // we've run out of symbols to represent higher numbers so..
2+1, 2+2, // just like with base 10, we wrap around and start adding to the base, which is 2
2+2+1, 2+2+2 // it can go on and on like this
One other important thing to know about computing is that we always start counting at 0 rather than one. So instead of the base two system using 1 and 2, it just uses zero and one. I think the reason for this is that the counting system still works no matter what the symbols are, but if we used 1 and 2, we wouldn't have a way of representing the number zero, so starting with zero gives us that for free.
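If you'd like to see that counting pattern for yourself, here's a small Python sketch. The function name count_in_binary is made up for this example. The built-in format(n, "b") spells a number using only the symbols 0 and 1.

```python
def count_in_binary(limit):
    """Return the base 2 spellings of 0 up to (but not including) limit."""
    return [format(n, "b") for n in range(limit)]


# Counting starts at 0, and a new digit appears every time we run out of symbols.
print(count_in_binary(8))
# ['0', '1', '10', '11', '100', '101', '110', '111']
```

Notice how a new place gets added at 2 (10) and again at 4 (100), the same way decimal adds a place at 10 and 100.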
Bits and Bytes
So now that we know that pretty much any number can be converted to binary, the next thing to understand is that each zero or one represents the smallest amount of information a computer can store. Those are called bits, which is short for "binary digits", and they really are the smallest bit of information.
So if a group of bits represents a number, we have to know where one number stops and the next one begins. For example, if we wanted to store the numbers 5, 32, and 24, there has to be some separation between them. If you saw 53224, it would mean something totally different. So to 'break up' long sequences of bits, we have the concept of bytes. A byte is just a group of 8 bits. Because a byte only has 8 bits to work with, it is limited to 256 different values (0 - 255), so the largest number it can store is 255. But we can store larger numbers by stringing together more bytes.
Maybe you've heard of something like the original Nintendo Entertainment System being an "8 bit" system. It just means its processor handled one byte of data at a time, which equates to numbers up to 255 (2^8 - 1). You can see why something like the Nintendo 64 was such a big deal, because it could handle a lot more data at once. The maximum number that 64 bits can hold is 18,446,744,073,709,551,615 (2^64 - 1). There's a bit more to it than that because the range is different if you want to include negative numbers, but that should give you an idea of the differences.
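Here's a rough Python sketch of those limits. The name max_unsigned is made up for illustration, and it assumes we only care about non-negative numbers. The second half shows one way a number too big for a single byte can be spread across two of them, using Python's built-in int.to_bytes and int.from_bytes.

```python
def max_unsigned(bits):
    """Largest non-negative number that fits in the given number of bits."""
    return 2 ** bits - 1


print(max_unsigned(8))   # 255
print(max_unsigned(64))  # 18446744073709551615

# 1000 does not fit in one byte, so we string two bytes together.
two_bytes = (1000).to_bytes(2, byteorder="big")
print(list(two_bytes))                             # [3, 232] because 3*256 + 232 == 1000
print(int.from_bytes(two_bytes, byteorder="big"))  # 1000
```

The byteorder="big" choice just means the most significant byte comes first, the same way we write the biggest digit first in decimal.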
More on the Math Behind It
Now, we can use this system to create just about any number.
OK, let's recap how ‘regular’ (base 10) numbers work. For example, the number 751 in base 10 essentially means
7*100 + 5*10 + 1*1, because every place we move to the left means we multiply that digit by the next power of 10. Maybe you remember learning the 1s place, 10s place, 100s place, etc. in grade school. These are the same as 10^0, 10^1, 10^2, and so on.
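To make the place values concrete, here's a minimal Python sketch that splits a decimal number into its digits and the power of 10 each one gets multiplied by. The function name expand_decimal is hypothetical, and it assumes a non-negative whole number.

```python
def expand_decimal(number):
    """Return (digit, place value) pairs for a non-negative whole number."""
    digits = str(number)
    top = len(digits) - 1  # exponent of the leftmost place
    return [(int(d), 10 ** (top - i)) for i, d in enumerate(digits)]


pairs = expand_decimal(751)
print(pairs)                          # [(7, 100), (5, 10), (1, 1)]
print(sum(d * p for d, p in pairs))   # 751
```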
So for binary numbers, we can create the same values in the same way as base 10 (aka decimal). But instead of using the symbols 0,1,2,3,4,5,6,7,8,9, we just use the symbols 0 and 1, and instead of multiplying each place to the left by a power of 10, we multiply it by a power of 2 and add up the results. So for example, to convert decimal 5 to binary, we would say:
1*2^2 + 0*2^1 + 1*2^0 = 4 + 0 + 1
Which looks like 101
But you can easily convert it back to decimal in your head by thinking of it as
2^2 + 0 + 2^0 = 4 + 0 + 1 = 5
Because anything multiplied by zero is zero, anything (other than zero) to the zeroth power is one, and of course anything multiplied by one is itself.
So yes, they are zeros and ones, but to convert them to numbers you can better visualize, this is an easy way to think of them.
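Here's a short Python sketch of going both directions. The names to_binary and from_binary are made up for this example. Going from decimal to binary uses repeated division by 2, which is one common way to do it and not the only one. Going back just adds up a power of 2 for every 1.

```python
def to_binary(number):
    """Spell a non-negative whole number in base 2 using repeated division by 2."""
    if number == 0:
        return "0"
    bits = []
    while number > 0:
        number, remainder = divmod(number, 2)
        bits.append(str(remainder))  # remainders come out right-to-left
    return "".join(reversed(bits))


def from_binary(bits):
    """Add up 2^position for every 1, counting positions from the right."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(to_binary(5))          # 101
print(from_binary("101"))    # 5
print(from_binary("11000"))  # 24
```

Python can also do this for you with format(5, "b") and int("101", 2), but writing it out shows there's nothing magic going on.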
And for bonus information, now that you understand how a computer can represent any number with a bunch of on and off switches, you might be wondering how those numbers translate into letters, symbols, images, videos, etc. OK, I’ll just give a brief explanation of letters. So there’s basically a big table of symbols called the ASCII table (American Standard Code for Information Interchange) and each symbol on it has a number associated with it…
So the computer can store text as numbers (binary) internally, then when it’s time to display the text back to you, it just does a lookup in that ASCII table and pulls in the symbol associated with that number. The symbol itself is basically a grid of ons and offs that allows the computer to ‘draw’ the symbol. There’s more to it than that but that’s the gist.
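You can watch that lookup happen in Python. This is only a sketch: the helper names text_to_codes and codes_to_text are made up, and it sticks to plain ASCII characters. The built-ins ord and chr do the actual table lookup in each direction.

```python
def text_to_codes(text):
    """Look up the number for each character."""
    return [ord(ch) for ch in text]


def codes_to_text(codes):
    """Look up the character for each number."""
    return "".join(chr(code) for code in codes)


codes = text_to_codes("Hi!")
print(codes)                                   # [72, 105, 33]
print([format(code, "08b") for code in codes])  # ['01001000', '01101001', '00100001']
print(codes_to_text(codes))                    # Hi!
```

The "08b" format pads each number to 8 bits, so every character lines up with one byte.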
Oh, and other bonus info: you can technically have any base for a number system. Humans (most likely) settled on 10 because we have 10 fingers and it’s easy to use them to count. But base 16 (hexadecimal) is commonly used, and so is Base64, an encoding that uses 64 different symbols. You’ve probably seen them even if you didn’t realize it. Computers store and transmit in base 2 mainly because it’s hard to confuse something being on with something being off. It’s possible for them to use a base 3 system (low, medium, high) and probably others, but it’s rare and not the standard, because outside electromagnetic energy could push a state slightly higher or lower. With three or more levels packed closer together, that could cause inconsistencies in the data, so it’s more reliable to have only two states to worry about.
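As a quick illustration, here's the same number spelled in a few different bases in Python. The function name spell_in_bases is hypothetical. The built-in format handles base 2 and base 16, and int(text, base) reads a string back in any base from 2 to 36.

```python
def spell_in_bases(value):
    """Spell one non-negative number in binary and hexadecimal."""
    return {"binary": format(value, "b"), "hex": format(value, "x")}


print(spell_in_bases(255))  # {'binary': '11111111', 'hex': 'ff'}

# Reading different spellings back gives the same number every time.
print(int("11111111", 2))   # 255
print(int("ff", 16))        # 255
print(int("100110", 3))     # 255, since 243 + 9 + 3 == 255
```

The number itself never changes. Only the way it is written down does.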