The useless and awkward .txt filename extension
For the past six or seven years, I have lived exclusively in the
Unix-like world, specifically with the GNU/Linux operating system, or
at least as exclusively as is possible on this green Earth of ours.
Every computer user today, whatever operating system they personally
prefer, will have to interact now and then with Microsoft operating
systems or with their users. It's impossible to escape this
interaction completely, although it is possible to reduce it to a
bare minimum, as I have.
In this Unix-like domain (in which I'm hesitant to include Mac OS X
despite its POSIX approval, as it breaks some long-standing Unix
traditions), there are several reigning philosophies and traditions
that clash with those of other operating systems. One of them, which
doesn't get nearly as much publicity as it should, is this: every
file should be text unless it absolutely has to be binary (i.e. an
opaque stream of zeros and ones), whether because it is an executable
file, a compressed file, or, more recently, a media file. This
tradition can still be seen in one tool that came of age on Unix:
FTP, which even now refuses to die completely despite the much better
alternatives available. The FTP protocol distinguishes two commonly
used transfer modes, text and binary, reflecting the primary
importance that the Unix-like world places on text.
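To make the text/binary distinction concrete, here is a minimal
Python sketch of what a text-mode transfer does to line endings. It
assumes the convention from the FTP specification (RFC 959) that text
travels over the wire with CRLF line endings, and it assumes a
Unix-like host whose local text files use a bare LF. The function
names are invented for this illustration and belong to no real FTP
library.

```python
def local_to_wire(data: bytes) -> bytes:
    """Sender side of a text-mode transfer: each LF becomes CRLF.

    Assumes the local file uses bare LF line endings (Unix convention).
    """
    return data.replace(b"\n", b"\r\n")


def wire_to_local(data: bytes) -> bytes:
    """Receiver side of a text-mode transfer: each CRLF becomes LF."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    text = b"first line\nsecond line\n"
    # Text survives the round trip unchanged.
    print(wire_to_local(local_to_wire(text)) == text)

    # Arbitrary binary data that happens to contain CR LF does not:
    # the receiver silently drops the CR byte.
    blob = b"\x89\r\n\x1a"
    print(wire_to_local(blob) == blob)
```

Binary mode simply skips both conversions and ships the bytes as they
are, which is why sending an archive or an image in text mode tends
to corrupt it.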
Beneath this prominence of text runs another current: openness.
Openness has always been part of the Unix-like world, even before
free Unix systems existed. The reason why text is so important in
Unix is that text is easy to view and edit. You don't need any fancy
software to do it, even if such software exists. Every application
can easily read and write text. This is why editor wars are so
prominent among us, why there are so many editors, why so many
administrative tasks boil down to "open this text file and edit it
this way," why so many text tools exist in Unix, why even the shell
itself has built-in text manipulation tools. Text matters most of all
for configuration and data files. The guiding principle is this: you
never know when a user might be interested in seeing or changing the
way your program works, so make it as easy for them as possible; make
it text. In the Unix-like world, text reigns supreme.
Openness is the antithesis of everything Microsoft stands for. The
company views openness as a virus or a cancer, with some of its
spokespeople using that exact word, "cancer". Thus, in its world, the
guiding principle concerning text is this: you never know when you
might be exposing proprietary secrets to your users, so make it as
hard as possible to view data and configuration files; make them
binary. If you can further obfuscate your data, maybe even encrypt
it, even better. It must be hard to read.
On the relatively rare occasions when data, configuration, or
documentation isn't in some hard-to-read format in Microsoft operating
systems, when it is in fact text, Microsoft saw fit to highlight this
fact by labelling the data as such. This is the origin of the .txt
file extension in Microsoft operating systems. Text occurs so rarely
there that it's important for them to highlight it when it happens.
The .txt extension clashes with Unix-like OSes
Thus, whenever I get any sort of software distribution on my cozy
Debian boxen (I happen to have been using Debian exclusively for the
same six or seven years, although this is more out of habit than
conviction), and I see that someone put a .txt file in there, I
immediately know that a Windows user created this distribution, or at
the very least, someone who frequently has to interact with Windows
users.
It's harmless, of course. Some may argue that it's even useful. I say
it's redundant. Duh. Of course it's text! What else could it be? If it
were something that was meant to be executed, then it would have
executable permissions set, another concept from the Unix-like
world. If it were a media file, then it would have the appropriate
media file extension, a recent habit now that media files are so much
more commonplace. If it were a compressed archive, it would have the
appropriate extension too. Otherwise, it's text by default, and no
filename extension is necessary to highlight this. And should a file
break this default tradition, or should its name simply be wrong, I
can always run the file(1) command on it, which checks the file's
magic number and tells me what kind of file it actually is. After
all, the only operating system stupid enough to believe filename
extensions is the one on the other side of the fence, on the side
that believes openness to be anathema.
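For the curious, the idea behind file(1) can be sketched in a few
lines of Python. This is only a rough illustration, not how file(1)
is actually implemented: the real tool consults a large database of
signatures, whereas the table below is a tiny hand-picked sample, and
the name guess_kind is made up for this example. The "no NUL bytes
and valid UTF-8 means text" rule is likewise a simplifying
assumption.

```python
import codecs
import os
import tempfile

# A tiny, illustrative table of magic numbers. The real file(1)
# knows thousands of signatures; these four are just a sample.
MAGIC_NUMBERS = [
    (b"\x7fELF", "ELF binary"),
    (b"\x1f\x8b", "gzip compressed data"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"#!", "script with an interpreter line"),
]


def guess_kind(path):
    """Guess what a file holds from its first bytes, ignoring its name."""
    with open(path, "rb") as handle:
        head = handle.read(512)
    if not head:
        return "empty"
    for magic, description in MAGIC_NUMBERS:
        if head.startswith(magic):
            return description
    if b"\x00" in head:
        return "binary data"
    try:
        # final=False tolerates a multi-byte character that was cut
        # off by the 512-byte read.
        codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
    except UnicodeDecodeError:
        return "binary data"
    return "text"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # A deliberately misnamed file: gzip bytes behind a .txt name.
        liar = os.path.join(folder, "notes.txt")
        with open(liar, "wb") as handle:
            handle.write(b"\x1f\x8b\x08\x00 not really text")
        print(liar, "->", guess_kind(liar))
```

The filename never enters the decision, which is the whole point: a
gzip archive named notes.txt is still reported as gzip data, and a
plain README with no extension at all is still reported as text.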
In a Unix-like operating system, there's a lot of text out there. When
I first realised this during my early days with Debian, I was
overjoyed. I could poke around and see how my operating system worked
a lot better than I ever could before. There are many shell, Perl,
and Python scripts scattered throughout my operating system
accomplishing all sorts of tasks, as well as a handful in many other
scripting languages. Virtually all of the configuration files for
every program are in text, often heavily commented and documented
within the same configuration file. Most of the documentation is in
text too, perhaps with some slight formatting commands as in the
manpages, but ultimately still text. Every README, every manpage,
every Texinfo file is either pure, pristine text, or text with a few
formatting commands that are themselves also text. And let's not
forget that the source code for anything, which is nowadays widely
available in free operating systems, is also simply text. Text is
everywhere, and a .txt filename extension for us is almost as
descriptive as a .file filename extension, which is to say, not
descriptive at all.
When a file really is simply text without any special purpose, it
suffices to leave it without a filename extension. The operating
system will know its own.
The .txt filename extension is unnecessary
I can imagine one further reason why the proponents of the .txt
filename extension use it. Since Windows is an operating system that
believes filename extensions, if you change the filename extension,
Windows will change its behaviour regarding that file. Thus, if you
want the file to open in Notepad or perhaps WordPad when you
double-click on it, the file must have the .txt extension. If it
doesn't have an extension at all, Windows refuses to do anything with
it and asks you instead what to do with it. Why it also refuses to
take a peek inside the file to see what sort of file it actually is,
I cannot fathom. Maybe it thinks that peeking at files is a security
hazard because it could get infected with viruses.
Of course, GUIs in Unix-like operating systems tend to be a lot
smarter than this. Although both KDE's and GNOME's graphical
filesystem browsers will initially believe the filename extension,
they will also check the file's magic number if the file has no
extension. In particular, if it's a text file, they will open a text
editor when you attempt to open the file. You may not like the
default text editor they use (and I often don't), but at least they
know it's text, and they will open a text editor.
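The "believe the extension first, then peek inside" strategy
described above might look roughly like this in Python. It is a
sketch under stated assumptions, not the actual logic of KDE or
GNOME: the helper names, the three-entry sniffing rule, and the
OPENERS table are all hypothetical. Only mimetypes.guess_type is a
real standard-library call.

```python
import mimetypes
import os

# Hypothetical mapping from MIME type to the kind of program a file
# browser would launch.
OPENERS = {
    "text/plain": "text editor",
    "image/png": "image viewer",
    "application/gzip": "archive manager",
}


def sniff_mime_type(head):
    """Guess a MIME type from leading bytes (deliberately simplistic)."""
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if head.startswith(b"\x1f\x8b"):
        return "application/gzip"
    if b"\x00" not in head:
        return "text/plain"
    return "application/octet-stream"


def mime_type_for(path):
    """Trust a recognised extension; otherwise look inside the file."""
    by_name, _encoding = mimetypes.guess_type(os.path.basename(path))
    if by_name is not None:
        return by_name
    with open(path, "rb") as handle:
        return sniff_mime_type(handle.read(512))


def opener_for(path):
    """Name the kind of program that should open this file."""
    return OPENERS.get(mime_type_for(path), "ask the user")
```

With this in place, opener_for on an extensionless README full of
prose yields a text editor rather than a puzzled dialog box. Note
that the sketch also peeks inside when the extension is merely
unrecognised, which is slightly more generous than only handling the
no-extension case.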
There are a few situations where the .txt filename extension isn't
just unnecessary but downright misleading. This is the case when the
data contained inside the .txt file isn't "text" as we think of it
outside the computer world, i.e. prose or verse, some sort of natural
language. Sometimes, the .txt extension is used for configuration or
data files. Thus we have filenames like "preferences.txt",
"config.txt", or "data.txt". I have even seen source code with the
.txt extension, e.g. "cpp_program.txt". Sure, they probably are text
in format, but not in content, and more descriptive extensions could
be used if an extension really is deemed useful. Sadly, the relative
rarity of text configuration or data files outside of the Unix-like
world leads Windows users to choose uninformative extensions for
their files.
If you give a damn, eschew the .txt filename extension
I'll admit this is all a very frivolous discussion regarding filename
nomenclature. Whether you include or leave out a .txt extension in
your text files won't make a practical difference. But to those of us
who live in the Unix-like world, it looks out of place and is a sad
reminder that there exists another world where openness and text are
so rare that it's important to mark them explicitly when they do
occur. These users are like tourists who take snapshots of objects
and occurrences that are commonplace in our own country. I would hope
that when those users come to our land, they get comfortable enough
with our habits to leave out the .txt extension.
So do yourself a favour. Leave out the .txt extension when you come
to our Unix-like land, as a reminder to yourself and everyone else
that it's open by default, not closed.