cat'ing a binary messes up terminal - why?

David Lloyd lloy0076 at adam.com.au
Sat Oct 28 01:34:36 CST 2006


Ben,

> I have a query, not important, I'm just curious. Hopefully someone
> understands what's happening here.
> 
> If I say
> $ cat /bin/login
> 
> for example, my terminal gets all messed up - i.e. typing '/bin/login'
> now appears as
> 
> $ /␉␋┼/┌⎺±␋┼
> 
> I suppose some kind of character table translation is broken or
> something... can someone explain in detail why this happens? I NEED to
> know :)

The terminal accepts a stream of bytes which it may interpret in 
different ways, for example:

  1. It may interpret a byte as an instruction to display the letter
     "A"
  2. It may interpret certain bytes as control codes rather than as
     something to display. Input works similarly: when you press
     "ctrl + c" (^C) a control character is sent, and the terminal
     driver [generally] sends an "INT" (interrupt) signal to the
     process currently running in the foreground of the terminal
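
To make point 2 concrete: a control key is itself just one byte. Here is
a minimal sketch, assuming the usual ASCII convention that ctrl plus a
letter sends that letter's code with only its low five bits kept. The
function name ctrl() is made up for illustration.

```python
def ctrl(letter: str) -> int:
    """Return the byte conventionally sent for ctrl + <letter>."""
    # Keep only the low five bits of the ASCII code, giving 0x00-0x1F.
    return ord(letter.upper()) & 0x1F


if __name__ == "__main__":
    for key in "CDN":
        print(f"ctrl + {key} -> 0x{ctrl(key):02x}")
```

So ctrl + c is byte 0x03, and ctrl + n is byte 0x0E, which a binary file
can easily contain by accident.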

Now, different terminals have different capabilities; see:

  1. http://www.tldp.org/HOWTO/Text-Terminal-HOWTO-15.html
  2. man terminfo


Some terminals can display Unicode as well as other character sets.

Essentially, what I'm saying is that a terminal will attempt to display 
whatever bytes it receives as sensibly as it possibly can.

Now, obviously terminals can't make sense of "random" sequences of bytes. 
There needs to be some standard way of mapping bytes to characters and 
commands so that the terminal does what you intended. This is called an 
"encoding" and yes, there is more than one possible encoding.
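
As a rough illustration of "more than one possible encoding", the same
three bytes can be read as one character or as three, depending on which
encoding the reader assumes. This is only a sketch; decode_both() is a
name invented here, and UTF-8 and Latin-1 are picked simply as two common
examples.

```python
def decode_both(raw: bytes) -> tuple:
    """Decode the same bytes under two different encodings."""
    as_utf8 = raw.decode("utf-8")      # multi-byte sequences form one character
    as_latin1 = raw.decode("latin-1")  # every byte is its own character
    return as_utf8, as_latin1


if __name__ == "__main__":
    utf8_text, latin1_text = decode_both(b"\xe2\x94\xbc")
    print(len(utf8_text), len(latin1_text))
```

Neither reading is "wrong" as far as the bytes are concerned; the sender
and the terminal simply have to agree on one.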

When you "cat" a binary file, though, there is no "sensible" encoding: 
the bytes are machine code and data, not text. Consequently, the terminal 
does its best and you could get absolutely anything (including control 
characters or escape sequences that you don't want).

That is why you see garbage.
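
If you are curious which bytes in a file could upset a terminal, a small
scan is enough to find them. The following is a sketch only:
control_bytes() is a hypothetical helper, and treating tab, line feed and
carriage return as harmless is my own assumption.

```python
def control_bytes(data: bytes) -> list:
    """Return (offset, value) pairs for control bytes below 0x20."""
    harmless = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return
    return [(i, b) for i, b in enumerate(data)
            if b < 0x20 and b not in harmless]


if __name__ == "__main__":
    sample = b"text\x0emore\x1b[31m\n"
    for offset, value in control_bytes(sample):
        print(f"offset {offset}: 0x{value:02x}")
```

To try it on a real file, read the file in binary mode, for example with
open(path, "rb"), and pass the bytes in.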

If you do see garbage, you should type:

  reset

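...and most terminals will set themselves back into a known state. It is 
possible to send terminals byte sequences that switch character sets 
and/or do other weird things, and the effect persists after "cat" has 
finished. That is most likely what happened to you: your shell still 
receives "/bin/login", but the terminal draws it with a different 
character set.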
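
One concrete case: on VT100-style terminals the single byte 0x0E (Shift
Out) selects an alternate character set, which is often the DEC Special
Graphics line-drawing set. Below is a sketch of how that set remaps the
ASCII range from '`' to '~'. The table follows the VT100 glyph
assignments as I understand them; DEC_GRAPHICS, TABLE and
as_line_drawing() are names made up for this example.

```python
# DEC Special Graphics glyphs for ASCII 0x60 ('`') through 0x7E ('~').
DEC_GRAPHICS = (
    "\u25c6\u2592\u2409\u240c\u240d\u240a\u00b0\u00b1"  # ` a b c d e f g
    "\u2424\u240b\u2518\u2510\u250c\u2514\u253c\u23ba"  # h i j k l m n o
    "\u23bb\u2500\u23bc\u23bd\u251c\u2524\u2534\u252c"  # p q r s t u v w
    "\u2502\u2264\u2265\u03c0\u2260\u00a3\u00b7"        # x y z { | } ~
)

# Map each ASCII code point to the glyph the terminal would draw instead.
TABLE = {0x60 + i: glyph for i, glyph in enumerate(DEC_GRAPHICS)}


def as_line_drawing(text: str) -> str:
    """Show text as it would look with the line-drawing set selected."""
    return text.translate(TABLE)
```

Calling as_line_drawing("/bin/login") should give the same string of
symbols that appears in the question. Sending the byte 0x0F (Shift In)
normally switches the terminal back.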

Consequently, the short answer is:

"It's showing you weird characters and making strange noises because the 
bytes you are sending it are not encoded as anything it was meant to 
display. It's the Garbage In, Garbage Out principle."

Why do you need to know?

DSL



