by Tdarcos » Tue Oct 22, 2019 6:31 pm
This article deals with the history of why older source code tends to be one large file ("monolithic") instead of broken into several smaller files ("modular"). If you're not interested in a history lesson you can skip this article.
I want to talk about a feature of programming in general - and of interactive fiction in particular - that has become available more or less as a result of better editing tools and of source code modularization through tools like "make."
How many of you have ever read any of the older text adventure source files, such as DUNGEON (the predecessor of the commercial Zork), or the "grandfather" of all interactive fiction, the one that started it all, Crowther and Woods' [i]Colossal Cave Adventure[/i]? If you don't know Fortran, probably never. Or, if you were lucky, you had the opportunity to read it in one of the translations, whether into C or into an IF authoring system such as Hugo, TADS, or AGT.
But to paraphrase Klingon chancellor Gorkon in [i]Star Trek VI: The Undiscovered Country[/i], "You have not experienced programming IF until you've read Colossal Cave in the original Fortran." As far as software resources are concerned, we live in "a post-scarcity economy." Computers are inexpensive and powerful; memory is plentiful, vast, and cheap; disk space is so inexpensive it's almost free; and displays are so sharp and clear for graphical images that it's almost heartbreaking.
But a look back at 1970s technology reminds us of where we came from and of the resource limitations we had to live with. Colossal Cave ("CC") had to handle everything in upper case because mainframes hadn't yet shifted out of the punched-card era (I explain that term below), when everything was in upper case to make sorting faster and save disk space. Colossal Cave compressed text input into 6 bits per character in order to fit 6 characters into a 36-bit word (PDP-10) or 5 characters into a 32-bit word (IBM 360/370 and minicomputers). Memory was scarce and very expensive. Have you bought memory lately? A 4-GB memory module might cost about $30, for which you get about 4 million K of RAM. Back when CC was written, memory cost about $1,000 a K: literally a dollar a byte. This meant computers didn't have a lot of memory, and what they did have was precious.
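To make that packing concrete, here is a minimal sketch in C (not the actual Adventure code; the 6-bit character codes are invented for illustration): it maps upper-case letters and digits to 6-bit values and packs five of them into one 32-bit word, which is the kind of squeeze CC performed to fit text into the machine word sizes of the day.
[code]
#include <stdio.h>
#include <stdint.h>

/* Illustrative only: map A-Z, 0-9 and blank to 6-bit codes.
   (The real Adventure used the machine's own character set.) */
static unsigned sixbit(char c)
{
    if (c >= 'A' && c <= 'Z') return (unsigned)(c - 'A' + 1);
    if (c >= '0' && c <= '9') return (unsigned)(c - '0' + 27);
    return 0;                      /* blank / anything else */
}

/* Pack up to five 6-bit codes into one 32-bit word (30 bits used). */
static uint32_t pack5(const char *s)
{
    uint32_t word = 0;
    for (int i = 0; i < 5; i++) {
        char c = *s ? *s++ : ' ';  /* pad short words with blanks */
        word = (word << 6) | sixbit(c);
    }
    return word;
}

int main(void)
{
    /* Five upper-case characters fit in one 32-bit word. */
    printf("PLUGH packs to %08X\n", (unsigned)pack5("PLUGH"));
    return 0;
}
[/code]
On a 36-bit PDP-10 word the same trick holds six such codes per word instead of five.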
Well anyway, because tools for modularizing source code were very limited, a lot of programs were written on coding sheets and keypunched, one 80-character line at a time, onto dollar-bill-sized pieces of 100-pound card stock called "punched cards." While you could break a program into subroutines, you usually had to put the whole program together as one monolithic block of code in order to submit it for compilation and, hopefully (if you hadn't made any mistakes), execution. As people moved to terminals it became possible to keep source code on the computer, but, like everything else, disk space was expensive too. A disk drive that held 100-megabyte removable packs was the size of a modern washing machine or dishwasher and cost about $27,000 in 1970s dollars. Replacement packs were about a foot tall, about as wide as a dinner plate, and cost $700.
Do you remember older PCs where every file took up a multiple of 4K no matter its actual size? On larger files wasting as much as 3K wasn't that bad, but using 4K for a 300-byte file was painful. The same problem existed on mainframes, but it was worse because, as noted above, disk space was [i]lots[/i] more expensive. Creating one 75K or 200K source file used a lot less (very expensive) disk space than 50 or 100 small source files. Also, since text editors didn't support working with multiple files at once, working with one huge file was easier than working with lots of small ones.
Also, Fortran 66 - and some other languages used back then - didn't really support source code segmented into separate files. IBM Fortran IV had no equivalent of the INCLUDE statement, and neither did Fortran on the DEC PDP-10, which is where CC originated. Looking at other languages, COBOL did have the COPY verb, but you had to move the copied source into a separate copy library, which was extra work. And the editing tools of the day still didn't support editing multiple files simultaneously. If you can't quickly look from place to place to understand what is happening in the program you're working on, your only option is to merge everything together.
The tools and resources we have now make splitting large programs into many separate files trivial, and the effective cost is so negligible as to round to zero. You have to get to applications on the scale of the Linux kernel, Firefox, or Apache before source code uses a "serious" amount of disk space. Back then, a serious on-line application might have 100,000 lines of code and 300 screens. Today, "serious" means at least 10 megabytes or 2,500 separate files.
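To show how cheap that split is today, here is a minimal sketch (file names and contents are invented for illustration, not taken from any real project): a shared header pulled into each module with #include, and two small C files.
[code]
/* verbs.h -- shared declarations, pulled into each module with #include */
#ifndef VERBS_H
#define VERBS_H
void do_verb(const char *word);
#endif

/* verbs.c -- one small module */
#include <stdio.h>
#include "verbs.h"

void do_verb(const char *word)
{
    printf("You try to %s. Nothing happens.\n", word);
}

/* main.c -- another module */
#include "verbs.h"

int main(void)
{
    do_verb("xyzzy");
    return 0;
}
[/code]
A three-line Makefile (leaning on make's built-in compile rules) ties them together, so "make" recompiles only the module that changed:
[code]
# Makefile -- rebuild and relink only what changed
CC = cc
game: main.o verbs.o
	$(CC) -o game main.o verbs.o
main.o verbs.o: verbs.h
[/code]
Keeping each of these files open in its own editor window while you jump between them is exactly the capability the 1970s toolchain lacked.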
So anyway, that's the history of why source code files tended to be monolithic instead of modular. I'll go on to my issue in a following article.