Discussion:
VMS indexed files - how did they work?
t***@vrx.net
2006-05-11 18:48:12 UTC
My favorite has always been the VMS "indexed" file type.

I really enjoyed working with them. You could do anything (quite
literally).

There are still things you could (can) do with them so simply that
can't be done easily, if at all, on a modern RDBMS or other databases.

I liked the fact that everything was contained in one data file, no
external indexes or other external files as overhead.

And no need to keep track of the indexes in your programming, it was
all taken care of for you.

And you could literally "seek" any record by any string, even parts of
descriptive fields, etc.

Today, if you wanted to use a flat text file for a FIFO buffer,
you really can't, because you can't delete individual entries from a
text file, formatted (records) or not. You can blank out the record
data, but the record is still there.
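
The FIFO point can be illustrated with a small sketch. This is an in-memory toy in Python, not RMS: the `KeyedQueue` class and its method names are made up for illustration, and a sorted list of sequence numbers stands in for the file's index. It only shows why keyed records make a FIFO easy: the oldest record is the one with the lowest key, and deleting it removes the entry rather than blanking it.

```python
import bisect


class KeyedQueue:
    """Toy FIFO built on keyed records (hypothetical, in-memory only)."""

    def __init__(self):
        self._keys = []      # sorted sequence numbers, oldest first
        self._records = {}   # sequence number -> record data
        self._next_seq = 0

    def put(self, data):
        """Store a record under the next sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        bisect.insort(self._keys, seq)
        self._records[seq] = data
        return seq

    def get(self):
        """Read the record with the lowest key, then delete it outright."""
        if not self._keys:
            raise IndexError("queue is empty")
        seq = self._keys.pop(0)
        return self._records.pop(seq)

    def __len__(self):
        return len(self._keys)


queue = KeyedQueue()
for line in ("first", "second", "third"):
    queue.put(line)

oldest = queue.get()      # the entry is gone afterwards, not blanked
remaining = len(queue)
```

After the `get`, nothing is left behind for the removed record, which is the behaviour a flat text file cannot offer.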

I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
Wilm Boerhout
2006-05-11 19:38:01 UTC
Post by t***@vrx.net
My favorite has always been the VMS "indexed" file type.
I really enjoyed working with them. You could do anything (quite
literally).
Please, please do not use the past tense here. There are many
application environments today using RMS indexed files. Long may they last.
Post by t***@vrx.net
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
It is a matter of philosophy whether RMS is part of VMS. The RMS record
structure sure is older than VMS. OTOH, RMS is indeed very well
integrated into VMS.

How things "work" for the user is pretty well described in the various
RMS manuals that come with VMS.

How things "work" internally (what code is in place to fill the buckets,
so to speak) well, there have been "advanced RMS" courses in the past,
but I don't know if there's an "RMS Internals" manual around (/ping/ Hein)

/Wilm
Bill Todd
2006-05-11 19:41:48 UTC
***@vrx.net wrote:

...
Post by t***@vrx.net
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
The 'sauce' was never secret: DEC did indexed file internal structure
presentations starting in the late '70s. There's probably a .pdf
presentation (ISTR one done in the late '80s or early '90s) available
somewhere still today, but I don't know where.

The fundamentals of how an indexed file was managed can be understood by
studying the mechanisms used in standard B+ trees (I just checked
Wikipedia's description and it seems reasonably accurate, though
RMS, like many commercial implementations, did not - at least early-on -
reshuffle data in lightly-filled leaf nodes but rather reclaimed a node
only if it became empty). RMS extended the mechanisms by supporting
leaf entries with duplicate key values, an unvarying identifier for each
user data record valid for the life of the file (later and optionally
possibly only for the life of the record on VMS), alternate indexes with
leaf entries that pointed to the relevant user data records using those
stable IDs, locking mechanisms at the record level (only at the page
level in its 16-bit incarnations) allowing consistent access by multiple
concurrent readers and writers, optimizations (as the product matured
on VMS) like key- and data-compression, and (also later and only on VMS)
features like journaling. Extensions to support multiple user data
record types within a single file were AFAIK discussed from nearly the
start but never implemented.

- bill
Malcolm Dunnett
2006-05-11 21:17:18 UTC
Post by Bill Todd
Post by t***@vrx.net
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
The 'sauce' was never secret: DEC did indexed file internal structure
presentations starting in the late '70s. There's probably a .pdf
presentation (ISTR one done in the late '80s or early '90s) available
somewhere still today, but I don't know where.
I have a document from July 20, 1977 called "RMS-11 Design
and Logic Manual" (written by, among others, B. Todd) which gives a
pretty comprehensive description of the internals of an RMS indexed file.

This wasn't a formal publication but was something put together
"by hand". I got it from a local DEC Software Specialist many years ago.
Paul Williams
2006-05-12 17:46:36 UTC
Post by Bill Todd
The 'sauce' was never secret: DEC did indexed file internal structure
presentations starting in the late '70s. There's probably a .pdf
presentation (ISTR one done in the late '80s or early '90s) available
somewhere still today, but I don't know where.
Manx shows that "RMS Structures and Utilities on VAX/VMS: Student Guide"
is online:

http://vt100.net/manx/details/1,3769
--
Paul
JF Mezei
2006-05-11 19:46:52 UTC
Post by t***@vrx.net
I really enjoyed working with them.
I liked the fact that everything was contained in one data file [...] it was
all taken care of for you.
And you could literally "seek" any record by any string,
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked.
Why have you written everything in the past tense? VMS still exists
today, and RMS files are still used today.
Steve Lionel
2006-05-11 19:53:51 UTC
VMS indexed files initially were a copy of IBM ISAM files from OS/370.
They first appeared in VMS 2.0. In the mid-70s, ISAM files were the
most popular "database" format and VMS needed them to break into the
commercial applications market (didn't hurt that VAX COBOL needed them
too.) I designed and implemented the VAX FORTRAN support for them in
1979.

Steve
Bill Todd
2006-05-11 20:17:50 UTC
Post by Steve Lionel
VMS indexed files initially were a copy of IBM ISAM files from OS/370.
Not really - ISAM files were *strongly* associated with the physical
layout of the disks they resided on, and not conventional b-trees at
all. RMS resembled IBM's VSAM in several respects, though.
Post by Steve Lionel
They first appeared in VMS 2.0.
Actually, they appeared in VMS 1.0, but only as a compatibility-mode
RMS-11 implementation.

Post by Steve Lionel
In the mid-70s, ISAM files were the
most popular "database" format
I would have said that IBM's IMS was the most popular database format at
that time - and by some measures (such as the relative amount of
large-corporation centralized data under its management) it may still be
close to that today. Runners-up might perhaps have been Cullinane's
IDMS and Cincom's TOTAL, both of which were also 'network' databases (in
terms of internal and navigational structure, not in the 'distributed'
sense) and neither of which was tied to a single computer vendor.

Indexed files were file-system extensions that back then (before the
wide-spread adoption of Unix dumbed file systems down) were fairly
common across both minicomputer and mainframe vendors (there were no
serious PCs at the time), but no one that I can remember presumed to
characterize them as anything like 'databases' (though with a layer like
DATATRIEVE on top of them they could begin to look somewhat similar).

Post by Steve Lionel
and VMS needed them to break into the
commercial applications market (didn't hurt that VAX COBOL needed them
too.)
It might be closer to the mark to say that *DEC* needed them to break
into the commercial applications market. RMS on both the PDP-11 and on
VAX was initially developed in the same group (Languages and Data
Management, managed by Ron Ham) responsible for the 'commercial
languages' like COBOL, and VAX naturally inherited the products of this
group from their implementations on the 11 (though of course they were
mostly re-written to support the new environment).

- bill
Chris Scheers
2006-05-11 22:43:42 UTC
Post by Steve Lionel
VMS indexed files initially were a copy of IBM ISAM files from OS/370.
They first appeared in VMS 2.0. In the mid-70s, ISAM files were the
most popular "database" format and VMS needed them to break into the
commercial applications market (didn't hurt that VAX COBOL needed them
too.) I designed and implemented the VAX FORTRAN support for them in
1979.
I was under the impression that Xerox's CP-V keyed files were the
inspiration for VMS indexed files.

Many VMS ideas seem to have come from CP-V.
--
-----------------------------------------------------------------------
Chris Scheers, Applied Synergy, Inc.

Voice: 817-237-3360 Internet: ***@applied-synergy.com
Fax: 817-237-3074
Bill Todd
2006-05-12 00:49:58 UTC
Post by Chris Scheers
Post by Steve Lionel
VMS indexed files initially were a copy of IBM ISAM files from OS/370.
They first appeared in VMS 2.0. In the mid-70s, ISAM files were the
most popular "database" format and VMS needed them to break into the
commercial applications market (didn't hurt that VAX COBOL needed them
too.) I designed and implemented the VAX FORTRAN support for them in
1979.
I was under the impression that Xerox's CP-V keyed files were the
inspiration for VMS indexed files.
Many VMS ideas seem to have come from CP-V.
While I don't recall ever directly discussing with Ed Marison where his
ideas for RMS's indexed-file implementation came from, I do remember
that his previous work was on MUMPS-11, which I think itself had some
kind of indexed file facility that may have provided at least *some* of
the inspiration for RMS's.

- bill
Steve Lionel
2006-05-12 13:19:37 UTC
I see I was unclear. I did not mean that the actual implementation in
RMS was taken from IBM. Rather, it was the feature set of IBM's
implementation that was imitated. I never heard Xerox CP-V being
discussed.

Steve
Bart Z. Lederman
2006-05-12 14:27:24 UTC
As others have said, it was never secret.

There was also an RMS-11K for RSX-11M/M+ and RSTS/E that implemented
indexed files for the PDP-11 operating systems. I don't know if this
pre-dated the VAX, but it quite likely did. I don't believe there was
ever any 'inspiration' from MUMPS or other operating systems per se:
the general idea of having indexed files was certainly one that had
been around for a while, and had been implemented in a number of ways
on a number of operating systems. The people who first implemented
indexed files within Digital probably got a number of good ideas by
looking at what other people did: this is normal procedure in a wide
variety of endeavors.

(Anyone know if/when TOPS implemented indexed files?)

Bart.
Alan Greig
2006-05-12 14:56:49 UTC
Post by Bart Z. Lederman
(Anyone know if/when TOPS implemented indexed files?)
RMS-20 certainly existed. Think it came somewhere about the TOPS-20 V4
timeframe but could have been earlier. Bill?

As I never made much use of it under TOPS-20, that I recall, I can't say
how complete it was. Docs should still be online somewhere. May have a dig.

I think RMS-10 existed as well.
--
Alan Greig
Bill Todd
2006-05-12 18:43:36 UTC
Post by Alan Greig
Post by Bart Z. Lederman
(Anyone know if/when TOPS implemented indexed files?)
RMS-20 certainly existed. Think it came somewhere about the TOPS-20 V4
timeframe but could have been earlier. Bill?
RMS-20 was developed (largely by Seth Cohen, I think) in parallel with
RMS-11, and IIRC shipped in the late '70s (though possibly later than
RMS-11, which was first released in January 1977). I don't recall
whether it supported indexed files or just sequential (and possibly
relative), but I do remember it as being a relatively shoestring effort
compared with those on the 11 and VAX (largely to promote a measure of
program and data portability, possibly only for simpler stuff that
didn't require indexed-file support - which later was the motivation
behind the quasi-official limited RMS implementation on RT-11 as well).

- bill
Rich Alderson
2006-05-12 18:48:01 UTC
Post by Bart Z. Lederman
(Anyone know if/when TOPS implemented indexed files?)
Just a reminder that there is no such thing as "TOPS". Tops-10 and Tops-20
shared exactly zero code base, file system design, or world view.

But both operating systems fielded RMS implementations.
--
Rich Alderson | /"\ ASCII ribbon |
***@alderson.users.panix.com | \ / campaign against |
"You get what anybody gets. You get a lifetime." | x HTML mail and |
--Death, of the Endless | / \ postings |
Bob Koehler
2006-05-15 13:04:33 UTC
Post by Bart Z. Lederman
(Anyone know if/when TOPS implemented indexed files?)
TOPS-10 and -20 had byte-stream file systems.

I used TOPS-20 after TOPS engineers admitted that they had merged
common parts of the Fortran, Cobol, and Pascal I/O libraries to form
an RMS. Although I never used the last version of TOPS-20, I only
recall support for things like more networking in the latter
releases.

The last major changes I recall actually using were: DECnet Phase
IV, storing the passwords in an "unreversable" encryption and adding EDT.
Alan Greig
2006-05-15 16:09:56 UTC
Post by Bob Koehler
Post by Bart Z. Lederman
(Anyone know if/when TOPS implemented indexed files?)
TOPS-10 and -20 had byte-stream file systems.
Well TOPS-10 was really just block I/O with the user managing their own
buffers for byte streams. TOPS-20 fully supported byte stream I/O with
bytes of any length (7 bit ascii the most common). TOPS-20 also did
memory mapped file I/O.
Post by Bob Koehler
I used TOPS-20 after TOPS engineers admitted that they had merged
common parts of the Fortran, Cobol, and Pascal I/O libraries to form
an RMS. Although I never used the last version of TOPS-20, I only
I think the only time I recall using RMS on TOPS-20 was in a simple
Basic+2 program just to see if it worked.
--
Alan Greig
Rich Alderson
2006-05-16 00:46:28 UTC
Post by Alan Greig
Post by Bob Koehler
Post by Bart Z. Lederman
(Anyone know if/when TOPS implemented indexed files?)
TOPS-10 and -20 had byte-stream file systems.
Well TOPS-10 was really just block I/O with the user managing their own
buffers for byte streams. TOPS-20 fully supported byte stream I/O with
bytes of any length (7 bit ascii the most common). TOPS-20 also did
memory mapped file I/O.
Tops-20 *supported* byte streams. The underlying mechanism, that is, what the
OS did to implement them, was memory-mapped I/O using the same virtual memory
as the paging system for memory management generally.
--
Rich Alderson | /"\ ASCII ribbon |
***@alderson.users.panix.com | \ / campaign against |
"You get what anybody gets. You get a lifetime." | x HTML mail and |
--Death, of the Endless | / \ postings |
Mike Rechtman
2006-05-21 20:47:44 UTC
Post by Bart Z. Lederman
As others have said, it was never secret.
There was also an RMS-11K for RSX-11M/M+ and RSTS/E that implemented
indexed files for the PDP-11 operating systems. I don't know if this
pre-dated the VAX, but it quite likely did.
<snip>

RMS on RSTS definitely predated VMS. We used RSTS/E RMS (IIRC) on or
before version V7 (Of RSTS, that is) in a fairly large financial
application. Porting to Alpha was a matter of recompiling and relinking,
transferring the data files, and Bob's your uncle.

Mike
Post by Bart Z. Lederman
Bart.
--
---------------------------------------------------------------------
Usual disclaimer: All opinions are mine alone, perhaps not even that.
Mike Rechtman ****@tzora.co.il*
Kibbutz Tzor'a. Voice (home): 972-2-9908337
"20% of a job takes 80% of the time, the rest takes another 80%"
---------------------------------------------------------------------
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCM/CS d(-)pu s:+>:- a++ C++ U-- L-- W++ N++ K? w--- V+++$
PS+ PE-- t 5? X- tv-- b+ DI+ D-- G e++ h--- r+++ y+++@
------END GEEK CODE BLOCK------
Bill Todd
2006-05-12 20:32:59 UTC
Post by Steve Lionel
I see I was unclear. I did not mean that the actual implementation in
RMS was taken from IBM. Rather, it was the feature set of IBM's
implementation that was imitated.
Indeed, that was not clear - so I inferred that you were responding to
the question which was asked (which involved the details of RMS's
indexed-file implementation, not its feature set).

- bill
Hoff Hoffman
2006-05-11 20:00:01 UTC
Post by t***@vrx.net
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
It's a standard non-relational database, with mechanisms for keyed
searches through the indexes and for sequential access -- I'm aware of
nothing here that is or was ever considered to be secret.

There is a book on file system internals available, and there is a
general intro and guide to the file system in the OpenVMS manuals posted
at <http://www.hp.com/go/openvms/doc/>.
Larry Kilgallen
2006-05-11 20:02:40 UTC
Post by t***@vrx.net
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
To the level of approximation you seem to want, they work about like
ISAM (Indexed Sequential Access Method) files on IBM's MVS (z/OS)
operating system. That is different from the builtin RDBMS on IBM's
OS/400.

Lesser operating systems skip this capability, with the implication
that standardization of such on an operating system is not useful.
b***@instantwhip.com
2006-05-11 21:09:33 UTC
"Lesser operating systems skip this capability, with the implication
that standardization of such on an operating system is not useful"

don't you mean "profitable"?
Dan O'Reilly
2006-05-11 23:16:19 UTC
Post by Chris Scheers
Post by Steve Lionel
VMS indexed files initially were a copy of IBM ISAM files from OS/370.
They first appeared in VMS 2.0. In the mid-70s, ISAM files were the
most popular "database" format and VMS needed them to break into the
commercial applications market (didn't hurt that VAX COBOL needed them
too.) I designed and implemented the VAX FORTRAN support for them in
1979.
I was under the impression that Xerox's CP-V keyed files were the
inspiration for VMS indexed files.
Many VMS ideas seem to have come from CP-V.
Man, it's been a LONG time since I've heard talk of CP-V. I used it on a
Xerox Sigma-7 a whole lotta (more than I care to think about) years ago!

------
+-------------------------------+----------------------------------------+
| Dan O'Reilly | "There are 10 types of people in this |
| Principal Engineer | world: those who understand binary |
| Process Software | and those who don't." |
| http://www.process.com | |
+-------------------------------+----------------------------------------+
Bob Koehler
2006-05-12 12:56:21 UTC
Post by t***@vrx.net
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
IS not WAS.

There is hashing data stored in the file. This is possible because
the file system actually knows the difference between data and
meta-data.

If you block dump the file you can see the meta-data. If the data is
all characters you can TYPE the file and get the data on your
terminal in logical (not physical) record order without seeing the
meta-data because TYPE uses the file system to read the records
sequentially; it doesn't just copy raw bytes to the terminal.
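
The difference between the raw view and the record-level view can be sketched in a few lines of Python. The layout below is invented for illustration and is not the real on-disk format; `block_dump` and `type_file` are hypothetical stand-ins for a block dump and for TYPE, and the sorted index follows the B+ tree description given earlier in the thread.

```python
# Hypothetical layout: data records in arrival (physical) order, plus
# index metadata that a raw block dump would show alongside them.
physical_records = [
    ("K30", "gamma"),
    ("K10", "alpha"),
    ("K20", "beta"),
]

# Index entries are (key, slot), kept sorted by key.
index_metadata = sorted(
    (key, slot) for slot, (key, _) in enumerate(physical_records)
)


def block_dump():
    """Raw view: metadata and data exactly as stored."""
    return {"index": index_metadata, "data": physical_records}


def type_file():
    """Record-level view: follow the index and return only the user data."""
    return [physical_records[slot][1] for _, slot in index_metadata]


listing = type_file()
```

Here `listing` comes back in key (logical) order with no metadata in it, while `block_dump()` still shows the records in the order they were stored.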
Steve Matzura
2006-05-12 19:50:03 UTC
Post by t***@vrx.net
My favorite has always been the VMS "indexed" file type.
I know the OS was hiding a lot of the overhead, but I always wondered
how indexed files on VMS actually worked. or is this still "secret
sauce" today?
Whatever it is and however it works, it goes back further than VMS.
I've been using indexed files since RSTS days in the late 70's, when
you had to link the RMS libraries, one for each file type (indexed,
relative and sequential) into the running image or "task" with the
"task-builder," which was the RSTS linker. There were overlay
description language (ODL) files to optimize the loading of the running
code because before VMS you only had a 28Kword workspace--heck, the
machines running RSTS were only 512K, and what they did with that
512K, not to mention the 60KB slicing, was quite amazing. I remember
seeing DATATRIEVE for the first time in 1979-80 and marvelling at how
somebody came up with a program that could read indexed files that
didn't have to be compiled and task-built with the file record mapping
hard-coded in the program.
Steve Matzura
2006-05-14 09:54:41 UTC
I gotta say, what I wouldn't give for that li'l jewel of a feature to
be available on other operating systems. BTrieve and everything else
is fine but requires too much work on the part of the programmer.
With indexed files, you open them, you map them, you read them, you
write them, you're done. A nice black-box DLL for Windows or
something similar for Unix would/could be a real money-maker.
Tom Linden
2006-05-14 12:47:46 UTC
Post by Steve Matzura
I gotta say, what I wouldn't give for that li'l jewel of a feature to
be available on other operating systems. BTrieve and everything else
is fine but requires too much work on the part of the programmer.
With indexed files, you open them, you map them, you read them, you
write them, you're done. A nice black-box DLL for Windows or
something similar for Unix would/could be a real money-maker.
As you may know PL/I IO presumes an indexed file system. We have
partnered with another company to provide our ISAM package as a DLL
on Windows, stay tuned.
Steve Matzura
2006-05-16 21:42:41 UTC
Post by Tom Linden
As you may know PL/I IO presumes an indexed file system. We have
partnered with another company to provide our ISAM package as a DLL
on Windows, stay tuned.
Nope, never been a PL/I programmer, but that's fascinating news!
Tom Linden
2006-05-16 23:49:06 UTC
Post by Steve Matzura
Post by Tom Linden
As you may know PL/I IO presumes an indexed file system. We have
partnered with another company to provide our ISAM package as a DLL
on Windows, stay tuned.
Nope, never been a PL/I programmer, but that's fascinating news!
your misfortune:-)
Bob Koehler
2006-05-15 13:08:13 UTC
Post by Steve Matzura
I gotta say, what I wouldn't give for that li'l jewel of a feature to
be available on other operating systems. BTrieve and everything else
is fine but requires too much work on the part of the programmer.
With indexed files, you open them, you map them, you read them, you
write them, you're done. A nice black-box DLL for Windows or
something similar for Unix would/could be a real money-maker.
DEC shipped a subset of Ingres in the last few releases of Ultrix
to address this need.

HP and Sun shipped a VAX-Fortran compatible f77 compiler for HP-UX
and Solaris which eventually did support the VMS extensions to the
I/O statements to handle indexed files. I assume this is hidden in
the f77 library.