hehe, sorry about that. I've been pretty busy at work for the past few days. It's one of those weeks when everyone seems to go on holiday, and the people left behind are either asking too many questions or are the ones who just "float" at work. I call the slackers "floaters" because I'm not really sure what they do... I'm not so sure that they do either?!? hehehe. They are more than likely "Functional Analysts"... which irritates me. They are here because the "Business Analysts" and the "Technical People" apparently can't talk to each other. So, every time the "Business" needs something, the "Functional Analysts" step in to "translate" to the technical people how something should "function" in a program... hehehe. You ever see a movie called Office Space? The dude with "people skills" who basically did nothing... hehe, yea, that's a Functional Analyst all over. They really just get in the way...
In my opinion, a functional analyst knows just enough about technology to be dangerous in front of a computer doing anything other than reading and replying to email... It's pointless. Anyway, so yea, I've been busy with them unfortunately.
As to your questions. Yea, UTF-8 superimposed itself over top of ASCII and then added more. Basically, the first 128 characters of UTF-8 are exactly ASCII. Unicode (and UTF-8, which is just one way of encoding it) was ultimately needed for 2 major reasons. These may or may not be the biggest reasons according to some book, but they are to me:
1) Writing a program, or even just a file, that works across platforms in English but also in any other language
2) A lot of character sets and other computer functions (like command line functions) started to bump heads. Basically, as ASCII was extended to form larger sets, certain characters collided with ones the OS was already using for things like command-line navigation.
That's why, in my opinion, Unicode will never go away. That said, ASCII will never go away either, since ASCII is ultimately part of Unicode (and so are the Chinese characters, the Western European characters, et cetera).
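If you want to see that for yourself, here's a quick sketch in Python (my own example, nothing official) showing that ASCII text encodes to the exact same bytes in UTF-8, while other scripts just grow extra bytes per character:

    # Plain ASCII text: the UTF-8 bytes are identical to the ASCII bytes.
    ascii_text = "Hello"
    print(ascii_text.encode("utf-8"))  # b'Hello' (48 65 6c 6c 6f)
    print(ascii_text.encode("ascii"))  # identical bytes

    # Non-ASCII characters still work, they just take more bytes each.
    print("é".encode("utf-8"))   # b'\xc3\xa9' (2 bytes)
    print("中".encode("utf-8"))  # b'\xe4\xb8\xad' (3 bytes)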
Unicode is machine code driven (like ASCII was). So you really can't tell a computer not to read Unicode, short of writing a program in something other than Unicode. In UTF-8, the decoder reads the first byte of a character, and if it's a "negative" byte (high bit set, say 11000000), the leading bits tell the program how many bytes, including that one, make up the character. So in that example, the two leading 1-bits mean 2 bytes are to be used to translate that bit pattern into a character. This is how it gets over 1 million characters rather than just the 256 you'd get out of one byte.
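Here's that leading-byte trick as a little Python sketch (again, just my own illustration, the function name is made up): you can read a character's length straight off the first byte's bit pattern:

    def utf8_char_length(first_byte):
        # 0xxxxxxx -> plain ASCII, 1 byte
        if first_byte & 0b10000000 == 0:
            return 1
        # 110xxxxx -> 2-byte character (the 11000000 case from above)
        if first_byte & 0b11100000 == 0b11000000:
            return 2
        # 1110xxxx -> 3-byte character
        if first_byte & 0b11110000 == 0b11100000:
            return 3
        # 11110xxx -> 4-byte character
        if first_byte & 0b11111000 == 0b11110000:
            return 4
        raise ValueError("not a valid UTF-8 leading byte")

    for ch in "Aé中😀":
        raw = ch.encode("utf-8")
        print(ch, raw, "->", utf8_char_length(raw[0]), "byte(s)")

Run that and you'll see 1, 2, 3 and 4 byte characters all living happily in the same string.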
As for writing in binary so that a computer will understand it, these days you more or less have to run a program that accepts assembly language (an assembler). The reason is that, unless you build everything from scratch, a computer already comes with the machinery to declare and interpret variables for you. So, unless you really want to just build it all again (just like re-inventing the wheel), you would use a program made for coding in binary (or some sort of assembly program). In the Unix world... there's plenty of them!
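You can actually peek at that layering without leaving Python: the standard dis module prints the bytecode (a kind of assembly for Python's own virtual machine, not real machine code, so take it as an analogy) that your source gets translated into before it runs:

    import dis

    def add(a, b):
        return a + b

    # Prints the virtual-machine instructions behind this tiny function.
    dis.dis(add)

Same idea all the way down: you write at one layer, and a translator (compiler, assembler, interpreter) turns it into the layer below.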