Discussion:
CRTL and RMS vs SSIO
Greg Tinkler
2021-10-06 02:06:29 UTC
I notice that SSIO (beta) is included in an upcoming V9.1 field test. So I read up on the issues it is trying to solve.

One concerning thing was giving the CRTL (via SSIO) direct access to XFC. From an architectural point of view this is wrong on so many levels, but if that is what needs to happen then open it up so RMS and other code bases can use it too.

The main reason stated was the need to do byte offset/count I/Os. Well, let's solve that first: change RMS by adding SYS$READB and SYS$WRITEB. These would be useful to all code using RMS.
SYS$READB: read from a byte offset for a byte count, returning the latest data from that byte range.
SYS$WRITEB: write from a byte offset for a byte count, updating the latest copy of the underlying blocks.

SYS$WRITEB needs to use the latest copy of the data, and could use the new SSIO interface to XFC, but RMS has its own methods for this.
It may seem like a big ask getting all the latest blocks, but if you think about it, it only needs to re-read the first and last blocks, and only if it does not already have the latest copy. There is also no need to re-read the first block if the offset starts at the beginning of a block, nor the last block if the write fills it completely.
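
To make the first/last block argument concrete, here is a minimal Python sketch (not VMS code). The 512-byte block size, the 0-based block numbering, and the function name blocks_to_reread are assumptions for illustration only; SYS$WRITEB is a proposal and has no real interface to model.

```python
BLOCK_SIZE = 512  # assumed disk block size in bytes


def blocks_to_reread(offset, count, block_size=BLOCK_SIZE):
    """Return the 0-based block numbers that a write of `count` bytes at
    byte `offset` only partly overwrites, so their old contents are needed."""
    if count <= 0:
        return []
    first = offset // block_size
    last = (offset + count - 1) // block_size
    partial = []
    # The first block is partial unless the write starts on a block boundary.
    if offset % block_size != 0:
        partial.append(first)
    # The last block is partial unless the write ends on a block boundary.
    if (offset + count) % block_size != 0 and last not in partial:
        partial.append(last)
    return partial


print(blocks_to_reread(512, 1024))  # aligned write: nothing to re-read
print(blocks_to_reread(100, 1000))  # unaligned at both ends
```

However long the write is, at most two blocks ever need to be fetched first; every block in between is overwritten in full.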

By having these as part of RMS we want to ensure the blocks/buffers are coordinated so any other user of RMS will see the changes, and we get their changes.

This seems to be at the core of the CRTL issue, it does NOT use RMS, nor does it synchronize its blocks/buffers, leading to the lost update problem.

So with this ‘simple’ addition the CRTL should be altered to use RMS for all file IO.

An extra that could be added, if the file is RFM=fixed, and the C code uses it that way with the same record length then use the SYS$GET/SYS$PUT so it will play nicely with an RMS access to those files.

Anyway, just my 2 cents' worth.

gt down under
Stephen Hoffman
2021-10-06 03:09:14 UTC
Post by Greg Tinkler
I notice that SSIO (beta) is included in an upcoming V9.1 field test.
So I read up on the issues it is trying to solve.
One concerning thing was to have CRTL (via SSIO) access directly to
XFC. From an architectural point of view this is wrong at so many
levels,...
Off the top, some of the various existing stuff that breaks layering on
OpenVMS includes HBVS volume shadowing, MOUNT, and byte-range locking.

IP as a layered product is broken layering.

The C select() call is a fine mess of mis-layering.

The XQP design is mis-layering.

There are other examples.

There are examples of breaking layering to advantage, such as ZFS
else-platform.

All discussions of layering and esthetics aside, I presume the primary
purpose of the SSIO project is to permit porting PostgreSQL to OpenVMS,
posthaste.
--
Pure Personal Opinion | HoffmanLabs LLC
Greg Tinkler
2021-10-06 03:32:50 UTC
Post by Stephen Hoffman
Off the top, some of the various existing stuff that breaks layering on
OpenVMS includes HBVS volume shadowing, MOUNT, and byte-range locking.
IP as a layered product is broken layering.
The C select() call is a fine mess of mis-layering.
The XQP design is mis-layering.
There are other examples.
There are examples of breaking layering to advantage, such as ZFS
else-platform.
All discussions of layering and esthetics aside, I presume the primary
purpose of the SSIO project is to permit porting PostgreSQL to OpenVMS,
posthaste.
Yup, exactly, hence get CRTL to use RMS which does work.

Re byte range locking, why not just use locking granularity (aka Rdb) to do the job. Very efficient and has worked for decades, and no need to change VMS DLM. Sure it may be nice to have an API that does this for us, but hey we are programmers.

gt
Stephen Hoffman
2021-10-06 15:09:20 UTC
Post by Greg Tinkler
Yup, exactly, hence get CRTL to use RMS which does work.
For this case, RMS really doesn't work at all well. Says why right
there in the name, too. Record management, not stream management.

C and IP have both been tussling with mismatched assumptions within the
OpenVMS file system since the instantiation of C on OpenVMS, too.

Lately, I've been tussling with the record-oriented assumptions within
OpenVMS. Records just never got as far along as objects. And RMS
records are an unmitigated joy around upgrades and mixed-version
clusters.

The various stream-format files are one of the ensuing compromises here.
Post by Greg Tinkler
Re byte range locking, why not just use locking granularity (aka Rdb)
to do the job. Very efficient and has worked for decades, and no need
to change VMS DLM.
The use of Oracle Rdb isn't viable as a dependency for many folks, and
lock granularity doesn't work at all well for arbitrary and overlapping
locking ranges.
Post by Greg Tinkler
Sure it may be nice to have an API that does this for us, but hey we
are programmers.
I don't want us each writing and debugging and maintaining
range-locking code for what is part of the C standard library, but you
do you.

As much as I'd like a general range-locking solution here in DLM, and
with adding (better?) stream I/O support into RMS, and as much as I'd
like to see OO API support added, and IP integration, and app and app
security integration with sandboxes, packaging, and package management,
and a whole pile of other badly-needed work, I'd infer that the folks
at VSI really want PostgreSQL as an available database option soonest.

There's a very long history of "can-kicking" here and a whole lot of
that is almost inherent and inevitable with the upward-compatibility
goals for the platform, and with resulting miasma far less visible to
those of us that have used OpenVMS for the past decade or three or
more, but is front and center with any new developer looking at the
APIs, and with any wholly new 64-bit app work.
--
Pure Personal Opinion | HoffmanLabs LLC
Craig A. Berry
2021-10-06 12:40:07 UTC
Post by Greg Tinkler
An extra that could be added, if the file is RFM=fixed, and the C
code uses it that way with the same record length then use the
SYS$GET/SYS$PUT so it will play nicely with an RMS access to those files.
I don't know the degree to which the current plan corresponds to the
original plan from a decade or so ago, but back then only stream files
were going to be supported by SSIO, which makes sense since the whole
point is locking byte ranges.
David Jones
2021-10-06 13:18:55 UTC
Post by Craig A. Berry
Post by Greg Tinkler
An extra that could be added, if the file is RFM=fixed, and the C
code uses it that way with the same record length then use the
SYS$GET/SYS$PUT so it will play nicely with an RMS access to those files.
I don't know the degree to which the current plan corresponds to the
original plan from a decade or so ago, but back then only stream files
were going to be supported by SSIO, which makes sense since the whole
point is locking byte ranges.
Open source software ports often come with the restriction that they only work
with stream-LF files. Maybe they should add a flag to directory files that, if set,
only allows the directory to contain stream-LF or directory files.

I keep a stmlf.fdl file in my login directory to use for copying (i.e. convert/fdl=...)
text files to NFS shares.
John Dallman
2021-10-06 19:04:00 UTC
Post by David Jones
Open source software ports often come with the restriction that it
only works with stream-LF files. Maybe they should add a flag to
directory files that if set only allows it to contain stream-LF
or directory files.
People used to UNIX or Windows generally find the other VMS file types
baffling and confusing. I got used to the idea, but never made use of
them, since my employers already had fewer customers on VMS than they did
UNIX when I joined, and the disparity only increased.

John
Greg Tinkler
2021-10-07 01:25:57 UTC
What a good conversation, some feedback.
Post by Arne Vajhøj
To be honest then I think the safest way to implement this is
to put lots of restrictions on when it is doable.
* No cluster support (announcement already states that!)
* Only FIX 512, STMLF and UDF are supported
* no mixing with traditional RMS calls
My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS solution is much much easier to program, uses well tested code, and is already cluster ready putting the team ahead of the game and not building issues for the future.
Post by Arne Vajhøj
I've a database product, a rather old product. At the time it was
implemented it was rather useful. But there was a locking issue. The
DLM locks resource names. The database would support I/O transfers of 1
to 127 disk blocks. How would one lock 127 contiguous disk blocks? The
blunt force method could be taking out 127 locks, not an optimum
solution. Having numeric range locking back in 1984 would have been
quite useful.
Yup, DLM uses resource names, but they can be hierarchical, like a B-Tree index. Also the resources need only exist when needed, and are removed if not. The lock tree size then depends on the lock contention.

This is why I made reference to Rdb: it uses this technique, and they are probably not the only ones. NB each level controls a range of resources and each level can have its own fan-out factor. The depth and lowest level always depend on the application's requirements.
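
A rough Python sketch of the idea, to show how a block range could map onto a tree of resources. This is not Rdb's or DLM's actual scheme; the function name cover_range, the (level, index) naming, the fan-out of 8 and the depth of 3 are all assumptions for illustration.

```python
def cover_range(lo, hi, fanout=8, levels=3):
    """Cover the block range [lo, hi) with aligned tree nodes.

    Returns (level, index) pairs; a node at `level` spans fanout**level
    blocks, so level 0 is a single block."""
    nodes = []
    while lo < hi:
        level = 0
        # Grow the node while it stays aligned at `lo` and inside the range.
        while level + 1 <= levels:
            span = fanout ** (level + 1)
            if lo % span != 0 or lo + span > hi:
                break
            level += 1
        span = fanout ** level
        nodes.append((level, lo // span))
        lo += span
    return nodes


nodes = cover_range(0, 127)
print(len(nodes), nodes[:3])
```

With these assumed numbers, 127 contiguous blocks are covered by 15 resources (one 64-block node, seven 8-block nodes and seven single blocks) rather than 127. A real scheme would also need intent-style locks on the parent nodes so that a lock on a large node conflicts with locks held beneath it; that part is not shown.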

FYI I am pretty sure RMS uses RFA to lock a record, this is an implied range of 1 record.
Post by Arne Vajhøj
No matter what the disk can do then the VMS file system is still
block oriented and I believe the system services take block offsets
not byte offsets.
All disks are block based, even on Unix. With some SSDs, yes, you can do byte transfers, but this should be left to the driver to optimise. Also, with X86_64 it will be virtualised, so what the...
Post by Stephen Hoffman
For this case, RMS really doesn't work at all well. Says why right
there in the name, too. Record management, not stream management.
Well yes and no. If you think about it most Unix text IO is record, ie LF terminated, and binary is fixed records not necessarily the same length in the file.

RMS for $GET and $PUT are record based, but $READ and $WRITE are block based, missing is $READB and $WRITEB, not just for CRTL but useful for various applications.

RMS ISAM with fixed length records is a pain, I have long argued ISAM should support variable length records, don’t care if they are VFC or STMLF, I would allow for both as VFC could allow for binary variable length records.

Likewise the keys on an ISAM file should be able to be variable based on a separator e.g “,” or <tab> or a combination.
Post by Stephen Hoffman
The use of Oracle Rdb isn't viable as a dependency for many folks, and
lock granularity doesn't work at all well for arbitrary and overlapping
locking ranges.
I think you will find a B-Tree style dynamic resource tree, similar to what Rdb uses, will work well. Any ‘byte range’ implementation will need some index to find interesting locks; DLM uses a hash, which is as efficient as you can get.
Post by Stephen Hoffman
Post by Greg Tinkler
Sure it may be nice to have an API that does this for us, but hey we
are programmers.
I don't want us each writing and debugging and maintaining
range-locking code for what is part of the C standard library, but you
do you.
NO, quite the opposite. I believe there is a POSIX standard for a locking API, and as VMS, sorry OpenVMS, wishes to maintain its POSIX stamp it should use these APIs with DLM underneath. NB DLM is also already cluster based, but you know that.
Post by John Dallman
People used to UNIX or Windows generally find the other VMS file types
baffling and confusing.
I always wondered why the CRTL did not have some smarts to present VFC records as STMLF and vice versa, effectively hiding the internal record structures. This could be done via open using the VMS extension “rfm=STMLF”, which should be the default unless it is a binary file (“rfm=unf”). If the file is VFC then the CRTL could do the translation. Wishful thinking.

gt down under
Arne Vajhøj
2021-10-07 01:48:24 UTC
Post by Greg Tinkler
What a good conversation, some feedback.
Post by Arne Vajhøj
To be honest then I think the safest way to implement this is
to put lots of restrictions on when it is doable.
* No cluster support (announcement already states that!)
* Only FIX 512, STMLF and UDF are supported
* no mixing with traditional RMS calls
My point is SSIO seems to be focused on just PostgreSQL, whereas an
RMS solution is much much easier to program, uses well tested code,
and is already cluster ready putting the team ahead of the game and
not building issues for the future.
I very much doubt that a full RMS solution is much easier.

:-)
Post by Greg Tinkler
Post by Stephen Hoffman
For this case, RMS really doesn't work at all well. Says why right
there in the name, too. Record management, not stream management.
Well yes and no. If you think about it most Unix text IO is record,
ie LF terminated, and binary is fixed records not necessarily the
same length in the file.
RMS for $GET and $PUT are record based, but $READ and $WRITE are
block based, missing is $READB and $WRITEB, not just for CRTL but
useful for various applications.
RMS ISAM with fixed length records is a pain, I have long argued ISAM
should support variable length records, don’t care if they are VFC or
STMLF, I would allow for both as VFC could allow for binary variable
length records.
????

Index-sequential files and the RMS API support variable-length records.

Not all language APIs on top of RMS do.
Post by Greg Tinkler
Post by Stephen Hoffman
The use of Oracle Rdb isn't viable as a dependency for many folks, and
lock granularity doesn't work at all well for arbitrary and overlapping
locking ranges.
I think you will find a B-Tree style dynamic resource tree, similar to
what Rdb uses, will work well. Any ‘byte range’ implementation will
need some index to find interesting locks, DLM uses hash which is as
efficient as you can get.
Hash is effective for finding exact matches but useless for finding
other matches aka "starting with". For those a tree is better.
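
A small Python sketch of that difference, using a dict as the hash and a sorted list (standing in for a tree) for the range lookup. The class name RangeIndex is hypothetical, and the sketch only handles exclusive, non-overlapping ranges; it is an illustration, not how DLM stores resources.

```python
import bisect

# Hash lookup: finds an identical range, but says nothing about overlaps.
exact = {(0, 100): "owner A"}
print((50, 60) in exact)  # False, although 50..60 lies inside 0..100


class RangeIndex:
    """Held ranges [start, end), non-overlapping and kept sorted by start."""

    def __init__(self):
        self.starts = []
        self.ends = []

    def overlaps(self, start, end):
        # Only ranges starting before `end` can overlap; because the held
        # ranges do not overlap each other, checking the last of those is enough.
        i = bisect.bisect_left(self.starts, end)
        return i > 0 and self.ends[i - 1] > start

    def add(self, start, end):
        if self.overlaps(start, end):
            return False
        i = bisect.bisect_left(self.starts, start)
        self.starts.insert(i, start)
        self.ends.insert(i, end)
        return True


idx = RangeIndex()
print(idx.add(0, 100), idx.add(50, 60), idx.add(100, 200))
```

The ordered structure answers "does anything overlap this range" with one binary search, which the hash cannot do without scanning every entry.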

Arne
Lawrence D’Oliveiro
2021-10-07 02:00:36 UTC
Post by Greg Tinkler
All disks are block based, even on Unix.
The difference being, on *nix systems, the responsibility for blocking and deblocking is left to the filesystem layer. So if a file is n bytes long, and n mod «sector size» ≠ 0, the application never sees what is in the padding bytes, if any.
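
As a worked example of that arithmetic, a minimal Python sketch; the 512-byte sector size and the function name allocation are assumptions for illustration.

```python
def allocation(n_bytes, sector=512):
    """Return (sectors used, padding bytes in the last sector)."""
    sectors = -(-n_bytes // sector)  # ceiling division
    padding = sectors * sector - n_bytes
    return sectors, padding


print(allocation(1000))  # a 1000-byte file
print(allocation(1024))  # an exact multiple of the sector size
```

A 1000-byte file occupies two 512-byte sectors with 24 bytes of padding, and a read through the filesystem layer still returns exactly 1000 bytes.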

Some filesystems even implement “tail packing”, which means the leftover bits of multiple files can share the same block, all transparently to the application, minimizing fragmentation.

By the way, Linus Torvalds did apparently use a VMS system at some point. (Must have been after his Sinclair QL days.) Guess what reason he gave, when asked why he hated it ...
Post by Greg Tinkler
RMS ISAM with fixed length records is a pain, I have long argued ISAM should support
variable length records ...
Given that nowadays an SQL-based RDBMS like SQLite can offer full support for transactions, joins and subqueries (missing only more multi-user-type features like locking and replication), and yet still be resource-light enough to fit in your mobile phone, I would say the time for application developers to be grubbing about in ISAM files is past.
Arne Vajhøj
2021-10-07 15:50:50 UTC
Post by Lawrence D’Oliveiro
Given that nowadays an SQL-based RDBMS like SQLite can offer full
support for transactions, joins and subqueries (missing only more
multi-user-type features like locking and replication), and yet still
be resource-light enough to fit in your mobile phone, I would say the
time for application developers to be grubbing about in ISAM files is
past.
There are still cases where it makes sense. RMS index-sequential files
are really a NoSQL key-value store in modern terminology, and such stores
are still used, with new ones even being developed (like
RocksDB).

But the default should change.

"use index-sequential file unless good reason to use relational database"

=>

"use relational database unless good reason to use
index-sequential file"

Arne
Dave Froble
2021-10-07 17:25:30 UTC
Post by Arne Vajhøj
Post by Lawrence D’Oliveiro
Given that nowadays an SQL-based RDBMS like SQLite can offer full
support for transactions, joins and subqueries (missing only more
multi-user-type features like locking and replication), and yet still
be resource-light enough to fit in your mobile phone, I would say the
time for application developers to be grubbing about in ISAM files is
past.
There are still cases where it makes sense. RMS index-sequential files
are really a NoSQL key-value store in modern terminology, and such stores
are still used, with new ones even being developed (like
RocksDB).
But the default should change.
"use index-sequential file unless good reason to use relational database"
=>
"use relational database unless good reason to use
index-sequential file"
Arne
I'd suggest there should not be a "default". Rather, make good
thoughtful decisions. Have valid reasons for any decisions or choices.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Dave Froble
2021-10-07 04:10:28 UTC
Post by Greg Tinkler
What a good conversation, some feedback.
Post by Arne Vajhøj
To be honest then I think the safest way to implement this is
to put lots of restrictions on when it is doable.
* No cluster support (announcement already states that!)
* Only FIX 512, STMLF and UDF are supported
* no mixing with traditional RMS calls
My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS solution is much much easier to program, uses well tested code, and is already cluster ready putting the team ahead of the game and not building issues for the future.
RMS is a bit too high level for what's being discussed.

But yeah, the real issue is that SSIO was aimed (it seems) at
PostgreSQL. In my opinion, that is poor software architecture and design.
Post by Greg Tinkler
Post by Arne Vajhøj
I've a database product, a rather old product. At the time it was
implemented it was rather useful. But there was a locking issue. The
DLM locks resource names. The database would support I/O transfers of 1
to 127 disk blocks. How would one lock 127 contiguous disk blocks? The
blunt force method could be taking out 127 locks, not an optimum
solution. Having numeric range locking back in 1984 would have been
quite useful.
Yup, DLM uses resource names, but they can be hierarchical, like a B-Tree index. Also the resources need only exist when needed, and are removed if not. The lock tree size then depends on the lock contention.
Well the perceived issue is what happens when taking out locks, and at
some point there is a conflict. Say needing 127 blocks locked, and the
conflict is on the last block. That means 126 locks to be released, and
perhaps try again.
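
The take-then-back-out pattern can be sketched in Python as below. Here threading.Lock merely stands in for a per-block lock; it is not the DLM interface, and the function name try_lock_blocks is hypothetical.

```python
import threading


def try_lock_blocks(locks, first, count):
    """Try to take `count` consecutive per-block locks without waiting.

    On the first conflict, release everything taken so far. Returns
    (success, number of locks that had been taken)."""
    taken = []
    for block in range(first, first + count):
        if locks[block].acquire(blocking=False):
            taken.append(block)
        else:
            # Conflict: back out in reverse order so nothing stays held.
            for b in reversed(taken):
                locks[b].release()
            return False, len(taken)
    return True, len(taken)


locks = [threading.Lock() for _ in range(127)]
locks[126].acquire()                   # someone else holds the last block
print(try_lock_blocks(locks, 0, 127))  # 126 locks taken, then all released
```

With the last block already held, the attempt takes 126 locks and then has to give all of them back before it can retry.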

In reality, the large I/O buffer capability is rarely used, and then
it's usually with exclusive file access, which precludes the need for
block locks, just the file lock. For random access, single block
locking and I/O is good. Larger I/O buffers are usually used for
sequential access, both read only, and updating.
Post by Greg Tinkler
This is why I made reference to Rdb: it uses this technique, and they are probably not the only ones. NB each level controls a range of resources and each level can have its own fan-out factor. The depth and lowest level always depend on the application's requirements.
FYI I am pretty sure RMS uses RFA to lock a record, this is an implied range of 1 record.
RMS has some interesting internals, basically below application usage.

Global buffers
Multiple buffers
Multi-block count

RMS can (I believe, it's been a long while) keep track of file usage,
and provide data from an RMS buffer to a user's buffer. No disk
activity required. Writes of course must go to disk. But even so, the
data can still be in the updated global buffers for use by multiple tasks.
Post by Greg Tinkler
Post by Arne Vajhøj
No matter what the disk can do then the VMS file system is still
block oriented and I believe the system services take block offsets
not byte offsets.
All disks are block based, even on Unix. With some SSDs, yes, you can do byte transfers, but this should be left to the driver to optimise. Also, with X86_64 it will be virtualised, so what the...
As long as storage is block oriented, then regardless of the numeric
range of bytes, all blocks encompassing the byte range will need to be
read, including locking, and written. This usually will include data
outside the byte range.
Post by Greg Tinkler
Post by Stephen Hoffman
For this case, RMS really doesn't work at all well. Says why right
there in the name, too. Record management, not stream management.
Ayep. RMS is record based.
Post by Greg Tinkler
Well yes and no. If you think about it most Unix text IO is record, ie LF terminated, and binary is fixed records not necessarily the same length in the file.
RMS for $GET and $PUT are record based, but $READ and $WRITE are block based, missing is $READB and $WRITEB, not just for CRTL but useful for various applications.
Forget RMS, I/O would be at the QIO level.
Post by Greg Tinkler
RMS ISAM with fixed length records is a pain, I have long argued ISAM should support variable length records, don’t care if they are VFC or STMLF, I would allow for both as VFC could allow for binary variable length records.
RMS keyed files can have variable record lengths.
RMS relative files require fixed length records. (if I remember correctly)
RMS sequential files can have variable record lengths.
Post by Greg Tinkler
Likewise the keys on an ISAM file should be able to be variable based on a separator e.g “,” or <tab> or a combination.
Post by Stephen Hoffman
The use of Oracle Rdb isn't viable as a dependency for many folks, and
lock granularity doesn't work at all well for arbitrary and overlapping
locking ranges.
I think you will find a B-Tree style dynamic resource tree, similar to what Rdb uses, will work well. Any ‘byte range’ implementation will need some index to find interesting locks; DLM uses a hash, which is as efficient as you can get.
Post by Stephen Hoffman
Post by Greg Tinkler
Sure it may be nice to have an API that does this for us, but hey we
are programmers.
I don't want us each writing and debugging and maintaining
range-locking code for what is part of the C standard library, but you
do you.
NO, quite the opposite. I believe there is a POSIX standard for a locking API, and as VMS, sorry OpenVMS, wishes to maintain its POSIX stamp it should use these APIs with DLM underneath. NB DLM is also already cluster based, but you know that.
Post by John Dallman
People used to UNIX or Windows generally find the other VMS file types
baffling and confusing.
That is because, without additional apps, Unix I/O is a stream of bytes.
There is no concept of records, such as that provided by RMS.

Frankly, (and yes, I'm biased), I find records reasonable, and a stream
of bytes baffling and confusing. Guess it's what one is used to.
Post by Greg Tinkler
I always wondered why the CRTL did not have some smarts to present VFC records as STMLF and vice versa, effectively hiding the internal record structures. This could be done via open using the VMS extension “rfm=STMLF”, which should be the default unless it is a binary file (“rfm=unf”). If the file is VFC then the CRTL could do the translation. Wishful thinking.
I would suggest the use of "VMS" in the above, rather than "CRTL". That
is unless one considers the CRTL VMS ...
Post by Greg Tinkler
gt down under
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Lawrence D’Oliveiro
2021-10-07 07:54:51 UTC
Post by Dave Froble
Frankly, (and yes, I'm biased), I find records reasonable, and a stream
of bytes baffling and confusing. Guess it's what one is used to.
Trouble is, there are many binary file formats that do not map easily to a simple sequence of records (of whatever delimitation). Consider the IFF family of file formats, for example: these are built out of chunks, and certain chunk types can contain other chunks.

For another example, consider file formats like TIFF and TTF, where there is a directory that identifies the location and size of the various major pieces. Oh, and PDF comes under this as well.

And then there are text-based format families, like XML, JSON, YAML, TOML ...
Arne Vajhøj
2021-10-07 14:48:58 UTC
Post by Lawrence D’Oliveiro
Post by Dave Froble
Frankly, (and yes, I'm biased), I find records reasonable, and a
stream of bytes baffling and confusing. Guess it's what one is used
to.
Trouble is, there are many binary file formats that do not map easily
to a simple sequence of records (of whatever delimitation). Consider
the IFF family of file formats, for example: these are built out of
chunks, and certain chunk types can contain other chunks.
For another example, consider file formats like TIFF and TTF, where
there is a directory that identifies the location and size of the
various major pieces. Oh, and PDF comes under this as well.
The whole record thing is mostly for text files and RMS based
database style usage.

Even on VMS then true binary files are usually FIX 512 (or in rare
cases UDF) with the structure handled entirely by the application.

Attempts to do otherwise often end up with 32K problems.
Post by Lawrence D’Oliveiro
And then there are text-based format families, like XML, JSON, YAML, TOML ...
Different. Both on *nix and VMS that is a separate structure
on top of the basic file format.

Arne
David Jones
2021-10-07 15:37:04 UTC
Post by Lawrence D’Oliveiro
Trouble is, there are many binary file formats that do not map easily to a simple sequence of records (of whatever delimitation). Consider the IFF family of file formats, for example: these are built out of chunks, and certain chunk types can contain other chunks.
Whatever happened to Compound Document Architecture (CDA)? It always struck me as an effort (now abandoned) toward an object oriented file structure.
Stephen Hoffman
2021-10-07 16:08:45 UTC
Post by David Jones
Post by Lawrence D’Oliveiro
Trouble is, there are many binary file formats that do not map easily
to a simple sequence of records (of whatever delimitation). Consider
the IFF family of file formats, for example: these are built out of
chunks, and certain chunk types can contain other chunks.
Whatever happened to Compound Document Architecture (CDA)? It always
struck me as an effort (now abandoned) toward an object oriented file
structure.
DEC ceded the desktop app business.

The modern equivalent to CDA is PDF.
--
Pure Personal Opinion | HoffmanLabs LLC
Stephen Hoffman
2021-10-07 15:51:31 UTC
Post by Lawrence D’Oliveiro
Post by Dave Froble
Frankly, (and yes, I'm biased), I find records reasonable, and a stream
of bytes baffling and confusing. Guess it's what one is used to.
Trouble is, there are many binary file formats that do not map easily
to a simple sequence of records (of whatever delimitation). Consider
the IFF family of file formats, for example: these are built out of
chunks, and certain chunk types can contain other chunks.
For another example, consider file formats like TIFF and TTF, where
there is a directory that identifies the location and size of the
various major pieces. Oh, and PDF comes under this as well.
And then there are text-based format families, like XML, JSON, YAML, TOML ...
There are many examples. It's far easier to map a whole executable
image into virtual memory or to use file system calls to load the whole
image into virtual memory, too. (This is an app design I never would
have considered on a VAX, too.)

For a number of apps and designs, I find RMS problematic for its
fondness for records in the lower parts of its position within the I/O
stack "funnel", and problematic again at somewhat higher levels of the
I/O stack "funnel" with what little RMS can do with those database
records it wants to enforce; its lack of marshaling and unmarshaling
for apps needing those services, among other sorts of designs, and with
all the usual "fun" with making changes to the contents and formats of
RMS records within apps.

Trying to make all apps fit within one NoSQL database really isn't all
that great of a solution. Getting PostgreSQL, SQLite, and other
databases better integrated is helpful. Longer-term and as I'd
mentioned in another reply, demoting 32-bit RMS to "just another local
database" status, too.

And to be absolutely clear here: if an app developer needs a NoSQL
database and as many apps can, having 32-bit RMS is entirely useful. At
least until the app developer needs to make changes or additions to the
record structures, when 32-bit RMS starts showing its age. A problem
related to how we now have roughly two-dozen files necessary within a
cluster configuration.
--
Pure Personal Opinion | HoffmanLabs LLC
Craig A. Berry
2021-10-07 12:50:19 UTC
Post by Dave Froble
the real issue is that SSIO was aimed (it seems) at
PostgreSQL.
And Apache, and Samba, and other things that have been explicitly
mentioned as having needed app-specific workarounds due to the absence
of shared stream I/O support. SSIO *is* the general-purpose solution
that you seem to be lamenting the lack of.
Arne Vajhøj
2021-10-07 13:40:07 UTC
Post by Craig A. Berry
the real issue is that SSIO was aimed (it seems) at PostgreSQL.
And Apache, and Samba, and other things that have been explicitly
mentioned as having needed app-specific workarounds due to the absence
of shared stream I/O support. SSIO *is* the general-purpose solution
that you seem to be lamenting the lack of.
Samba I totally get.

Multiple PC's writing to a file on a Samba share would create
some interesting scenarios.

But why does Apache need it?

It should read files to serve - and since it is serving VMS files
then I think it should be as VMSish as possible so totally standard RMS.
And it should write sequential text files like access.log.

What am I missing?

Arne
Craig A. Berry
2021-10-07 13:51:08 UTC
Post by Arne Vajhøj
Post by Craig A. Berry
the real issue is that SSIO was aimed (it seems) at PostgreSQL.
And Apache, and Samba, and other things that have been explicitly
mentioned as having needed app-specific workarounds due to the absence
of shared stream I/O support. SSIO *is* the general-purpose solution
that you seem to be lamenting the lack of.
Samba I totally get.
Multiple PC's writing to a file on a Samba share would create
some interesting scenarios.
But why does Apache need it?
It should read files to serve - and since it is serving VMS files
then I think it should be as VMSish as possible so totally standard RMS.
And it should write sequential text files like access.log.
What am I missing?
log files (and probably the fact that multiple worker processes can be
writing to the same logs). And I forgot to mention that Java needs it
too. See:

<http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf>

Page 16 says:

• Java (CIFS too) uses a work-around
− Does open+read/write+close for every read/write!
− Restores current file offset after each close+open
− Significant performance issue
• Oracle problem with log and trace files
− Single writer with multiple readers
• Apache’s use of log files sub-optimal
− V1.3 places restriction
− V2.0 uses a work-around
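
For readers who have not seen that pattern, here is a minimal Python sketch of an "open + seek + read/write + close on every call" wrapper. It is not the actual Java or CIFS code on OpenVMS; the class name ReopeningFile and its methods are hypothetical.

```python
import os


class ReopeningFile:
    """Keeps only a path and a byte offset; opens the file for every call."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # the "current file offset" kept across calls

    def write(self, data):
        fd = os.open(self.path, os.O_WRONLY | os.O_CREAT, 0o644)
        try:
            os.lseek(fd, self.offset, os.SEEK_SET)  # restore saved offset
            self.offset += os.write(fd, data)
        finally:
            os.close(fd)

    def read(self, size):
        fd = os.open(self.path, os.O_RDONLY)
        try:
            os.lseek(fd, self.offset, os.SEEK_SET)  # restore saved offset
            data = os.read(fd, size)
            self.offset += len(data)
            return data
        finally:
            os.close(fd)
```

Every read or write costs an extra open, seek and close, which is where the performance issue on the slide comes from.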
Arne Vajhøj
2021-10-07 14:01:09 UTC
Post by Craig A. Berry
Post by Arne Vajhøj
Post by Craig A. Berry
the real issue is that SSIO was aimed (it seems) at PostgreSQL.
And Apache, and Samba, and other things that have been explicitly
mentioned as having needed app-specific workarounds due to the absence
of shared stream I/O support. SSIO *is* the general-purpose solution
that you seem to be lamenting the lack of.
Samba I totally get.
Multiple PC's writing to a file on a Samba share would create
some interesting scenarios.
But why does Apache need it?
It should read files to serve - and since it is serving VMS files
then I think it be as VMSish as possible so totally standard RMS.
And it should write sequential text files like access.log.
What am I missing?
log files (and probably the fact that multiple worker processes can be
writing to the same logs).
I still don't get it.

I thought SSIO was about shared access to byte streams.

Writing to log files should be fine using good old record based
writes (somewhere down the call stack SYS$PUT).
Post by Craig A. Berry
  And I forgot to mention that Java needs it
<http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf>
• Java (CIFS too) uses a work-around
  − Does open+read/write+close for every read/write!
  − Restores current file offset after each close+open
  − Significant performance issue
In this context does "Java" mean "Tomcat"?

Arne
Craig A. Berry
2021-10-07 16:12:00 UTC
Post by Arne Vajhøj
Post by Craig A. Berry
Post by Arne Vajhøj
Post by Craig A. Berry
the real issue is that SSIO was aimed (it seems) at PostgreSQL.
And Apache, and Samba, and other things that have been explicitly
mentioned as having needed app-specific workarounds due to the absence
of shared stream I/O support. SSIO *is* the general-purpose solution
that you seem to be lamenting the lack of.
Samba I totally get.
Multiple PC's writing to a file on a Samba share would create
some interesting scenarios.
But why does Apache need it?
It should read files to serve - and since it is serving VMS files
then I think it be as VMSish as possible so totally standard RMS.
And it should write sequential text files like access.log.
What am I missing?
log files (and probably the fact that multiple worker processes can be
writing to the same logs).
I still don't get it.
I thought SSIO was about shared access to byte streams.
Writing to log files should be fine using good old record based
writes (somewhere down the call stack SYS$PUT).
Don't ask me, ask the authors of the document to which I linked. Or the
folks at VSI who inherited their work. I may be wrong and it's not
about log files, but suppose it is. If you start from the premise that
the log files are stream-oriented and you have multiple writers and
multiple readers at the same time, then that's pretty much the
definition of shared access to a byte stream. Doing it differently for a
platform that prefers records would be extra cost and extra maintenance.
Post by Arne Vajhøj
Post by Craig A. Berry
                           And I forgot to mention that Java needs it
<http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf>
• Java (CIFS too) uses a work-around
   − Does open+read/write+close for every read/write!
   − Restores current file offset after each close+open
   − Significant performance issue
In this context does "Java" mean "Tomcat"?
You know as much as I do -- probably more ;-).
Arne Vajhøj
2021-10-07 17:27:22 UTC
Post by Craig A. Berry
Post by Arne Vajhøj
I still don't get it.
Don't ask me, ask the authors of the document to which I linked. Or the
folks at VSI who inherited their work.
I know - I should not shoot the messenger. Sorry.

Arne
Dave Froble
2021-10-07 16:27:09 UTC
Post by Arne Vajhøj
I still don't get it.
I thought SSIO was about shared access to byte streams.
That is a bit of tunnel vision.

Locking numeric ranges could be used for many other things. Such a
capability should be generic, not just for a single purpose.

That's the problem I see, the tunnel vision when approaching the issue,
rather than the vision to see just how useful the capability could be.

Craig's post points that out.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Arne Vajhøj
2021-10-07 16:59:57 UTC
Post by Dave Froble
Post by Arne Vajhøj
I still don't get it.
I thought SSIO was about shared access to byte streams.
That is a bit of tunnel vision.
Not really. More like the definition.

<quote>
SSIO
====
Shared Stream IO feature provides POSIX compliant read/write to byte
stream files.
Hence SSIO feature, the data consistency is guaranteed when mutiple
processes are performing a Read/Write to non overlapping byte boundaries
with the same block boundary.
</quote>
Post by Dave Froble
Locking numeric ranges could be used for many other things.  Such a
capability should be generic, not just for a single purpose.
I agree that range locking is a useful feature for many other purposes
than SSIO.
Post by Dave Froble
That's the problem I see, the tunnel vision when approaching the issue,
rather than the vision to see just how useful the capability could be.
Craig's post points that out.
It listed some projects that could benefit from SSIO besides
PostgreSQL.

And I just don't understand some of the examples since they
sound traditional record oriented to me.

Arne
Dave Froble
2021-10-07 16:18:28 UTC
Post by Craig A. Berry
the real issue is that SSIO was aimed (it seems) at PostgreSQL.
And Apache, and Samba, and other things that have been explicitly
mentioned as having needed app-specific workarounds due to the absence
of shared stream I/O support. SSIO *is* the general-purpose solution
that you seem to be lamenting the lack of.
A while back we were discussing doing away with I/O to buffers, and
accessing the data in place. Slower access perhaps, but doing away with
the reading and writing to/from buffers. Haven't heard much about that
lately. I don't get out much.

Such type of activity would really benefit from having the capability of
locking just the required data, and, would need the capability of
reading and writing just the required data.

I'm aware of how useful something like SSIO would be. I'm just appalled
by the design and implementation. As mentioned, it seems aimed at just
a few current uses, and totally ignores how useful it would be for many
more future uses. This is rather consistent with the long time apathy
with which VMS has been treated. It's more a patch than an enhancement.
This is what I lament.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Stephen Hoffman
2021-10-07 17:01:49 UTC
Post by Dave Froble
A while back we were discussing doing away with I/O to buffers, and
accessing the data in place. Slower access perhaps, but doing away
with the reading and writing to/from buffers. Haven't heard much about
that lately. I don't get out much.
Ayup. Nonvolatile byte-addressable storage hardware is available now,
and is in use in various applications.

Compatible memory hardware will be rather more available for OpenVMS
x86-64, for folks interested in investigating this for their apps.

Carving out a hunk of persistent storage will be an interesting topic for
app developers on OpenVMS, though I can think of a couple of ways to
try.

Here's an HPE overview from a few years ago on the topic:
https://www.pdl.cmu.edu/SDI/2016/slides/keeton-2016-10-19-memory-driven-computing.pdf


I see some B-Tree work for this area in a newer paper, and a number of
other discussions.
Post by Dave Froble
Such type of activity would really benefit from having the capability
of locking just the required data, and, would need the capability of
reading and writing just the required data.
Locking access to the contents of a global section, or locking access
to hardware-backed storage for external devices, is the same issue.

Whether DLM overhead is too high for that to be workable is another
discussion that the app developers will want to ponder.
Post by Dave Froble
I'm aware of how useful something like SSIO would be. I'm just
appalled by the design and implementation. As mentioned, it seems
aimed at just a few current uses, and totally ignores how useful it
would be for many more future uses. This is rather consistent with the
long time apathy with which VMS has been treated. It's more a patch
than an enhancement. This is what I lament.
Alas, there's no other outcome when upward-compatibility is an
overarching goal for the platform.
--
Pure Personal Opinion | HoffmanLabs LLC
Dave Froble
2021-10-07 21:03:53 UTC
Post by Stephen Hoffman
Post by Dave Froble
A while back we were discussing doing away with I/O to buffers, and
accessing the data in place. Slower access perhaps, but doing away
with the reading and writing to/from buffers. Haven't heard much
about that lately. I don't get out much.
Ayup. Nonvolatile byte-addressable storage hardware is available now,
and is in use in various applications.
Compatible memory hardware will be rather more available for OpenVMS
x86-64, for folks interested in investigating this for their apps.
Carving out a hunk of persistent storage will be interesting topic for
app developers on OpenVMS, though I can think of a couple of ways to try.
https://www.pdl.cmu.edu/SDI/2016/slides/keeton-2016-10-19-memory-driven-computing.pdf
I see some B-Tree work for this area in a newer paper, and a number of
other discussions.
Post by Dave Froble
Such type of activity would really benefit from having the capability
of locking just the required data, and, would need the capability of
reading and writing just the required data.
Locking access to the contents of a global section, or locking access to
hardware-backed storage for external devices, is the same issue.
Whether DLM overhead is too high for that to be workable is another
discussion that the app developers will want to ponder.
Post by Dave Froble
I'm aware of how useful something like SSIO would be. I'm just
appalled by the design and implementation. As mentioned, it seems
aimed at just a few current uses, and totally ignores how useful it
would be for many more future uses. This is rather consistent with
the long time apathy with which VMS has been treated. It's more a
patch than an enhancement. This is what I lament.
Alas, there's no other outcome when upward-compatibility is an
overarching goal for the platform.
Now I'm just a dumb polock, wandered down out of the woods. But I just
don't see where upward compatibility has anything to do with
enhancements to the DLM. If existing calls continue to work as before,
and only when an optional extra parameter would enable new capabilities,
then upward compatibility just cannot be an issue. At least for this.

The optional parameter might be a "lock type", and if not present,
existing logic would be used, and if present, new code could be executed
to process the new lock type. Stuff a couple of quadwords into the
resource name for the numeric range. It would add one new piece of data
to the DLM data structure(s).
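
As a rough illustration of that idea, here is a minimal C sketch of a numeric range, the overlap test a range-aware lock manager would need, and the "couple of quadwords" packed into a byte buffer. All names are hypothetical; nothing here is an existing DLM interface.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical numeric range attached to a lock request (inclusive bounds). */
struct lock_range {
    uint64_t first;
    uint64_t last;
};

/* Two range locks can only conflict if they share at least one value. */
bool ranges_overlap(struct lock_range a, struct lock_range b)
{
    return a.first <= b.last && b.first <= a.last;
}

/* Pack the range as two quadwords (host byte order) so it could ride along
 * with a resource name. The 16-byte layout is an assumption. */
void pack_range(unsigned char out[16], struct lock_range r)
{
    memcpy(out, &r.first, 8);
    memcpy(out + 8, &r.last, 8);
}
```

Requests without a range would take the existing code path untouched, which is the upward-compatibility argument made above.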
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Greg Tinkler
2021-10-07 12:54:54 UTC
Post by Dave Froble
Well the perceived issue is what happens when taking out locks, and at
some point there is a conflict. Say needing 127 blocks locked, and the
conflict is on the last block. That means 126 locks to be released, and
perhaps try again.
Maybe, maybe not. It depends on the locking fan-out factors for the differing levels. It is possible that only 1 lock is needed, maybe more; the worst case would be 127. NB there is also BLAST to assist with managing the lock promotions/demotions.
Post by Dave Froble
As long as storage is block oriented, then regardless of the numeric
range of bytes, all blocks encompassing the byte range will need to be
read, including locking, and written. This usually will include data
outside the byte range.
Yup, as is the case on Unix... let the drivers worry about how and why this is done, block or byte, whatever the IO device needs.
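
For reference, the mapping from a byte range to the blocks that cover it is simple arithmetic. A minimal C sketch, assuming 512-byte disk blocks and virtual block numbers that start at 1; the function and struct names are made up for illustration.

```c
#include <stdint.h>

#define BLOCK_SIZE 512u   /* assumed disk block size in bytes */

/* Blocks covering a byte range: first virtual block number and block count. */
struct block_span {
    uint64_t first_vbn;
    uint64_t count;
};

/* Map (offset, length) in bytes to the covering blocks. length must be > 0.
 * VBNs are 1-based, so byte offset 0 lives in VBN 1. */
struct block_span blocks_for_bytes(uint64_t offset, uint64_t length)
{
    uint64_t first = offset / BLOCK_SIZE;
    uint64_t last = (offset + length - 1) / BLOCK_SIZE;
    struct block_span s;

    s.first_vbn = first + 1;
    s.count = last - first + 1;
    return s;
}
```

A 4-byte write at offset 510 touches two blocks, so both have to be read, locked and rewritten even though only 4 bytes change.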
Post by Dave Froble
Forget RMS, I/O would be at the QIO level.
Why? Underneath RMS is QIO. What RMS gives us is the coordination of the buffers/buckets/clumps/blocks across the cluster to ensure no lost updates, as per the example used to justify SSIO.
Post by Dave Froble
RMS keyed files can have variable record lengths.
True, VAR only not VFC or STM*, but fixed length key fields, with fixed offsets in the record
Post by Dave Froble
RMS relative files require fixed length records. (if I remember correctly)
Yup, they are implicitly fixed length.

===
Have been thinking about the byte range locking. As most of the use will be for locking ranges in a file, it should be integrated with RMS, i.e. RMS should have an API to allow this, as it already does the locking for the buffer/bucket/clump/block. Just need another 1 or 2 layers of lock tree and you have it. And it will all be cluster-wide, and it will be compatible with other users of RMS.

gt
Dave Froble
2021-10-07 16:50:39 UTC
Post by Greg Tinkler
Post by Dave Froble
Well the perceived issue is what happens when taking out locks, and at
some point there is a conflict. Say needing 127 blocks locked, and the
conflict is on the last block. That means 126 locks to be released, and
perhaps try again.
Maybe, maybe not. It depends on the locking fan out factors for the differing levels. It is possible that only 1 lock is needed, may be more, the wort case would be 127. NB there is also BLAST to assist with managing the lock promotion/demotions.
Post by Dave Froble
As long as storage is block oriented, then regardless of the numeric
range of bytes, all blocks encompassing the byte range will need to be
read, including locking, and written. This usually will include data
outside the byte range.
Yup, as is the case on Unix...let the drivers worry about how and why this is done, block/byte what ever the IO device needs.
Post by Dave Froble
Forget RMS, I/O would be at the QIO level.
Why? Underneath RMS is QIO, what RMS gives us the the coordination of the buffers/buckets/clumps/block across the cluster to ensure not lost updates, as per the example used to justify SSIO.
Too limited and specific purpose. RMS might be able to make use of some
capabilities, but so might other applications.

RMS does some things well, and doesn't have some capabilities that it
perhaps should have. Data field definitions in records comes to mind.
Post by Greg Tinkler
Post by Dave Froble
RMS keyed files can have variable record lengths.
True, VAR only not VFC or STM*, but fixed length key fields, with fixed offsets in the record
Post by Dave Froble
RMS relative files require fixed length records. (if I remember correctly)
Yup, there are implicitly fixed length.
===
Have been thinking about the byte range locking. As most of the use will be for locking ranges in a file it should be integrated with RMS, i.e. RMS should have an API to allow this as it already does the locking to the buffer/bucket/clump/block. Just need another 1 or 2 layers of lock tree and you have it. And it all be cluster wide, and it will be compatible with other users of RMS.
Short sighted thinking. Numeric range locking might be useful in many
applications.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-10-07 12:26:36 UTC
Post by Greg Tinkler
Post by Stephen Hoffman
For this case, RMS really doesn't work at all well. Says why right
there in the name, too. Record management, not stream management.
Well yes and no. If you think about it most Unix text IO is record, ie LF terminated, and binary is fixed records not necessarily the same length in the file.
How do you find byte 12,335,456 in a variable length RMS sequential file
without reading from the start of the file ?

That's why there are restrictions on RMS supported file formats in an
application in some cases.
Post by Greg Tinkler
I always wondered why the CRTL did not have some smarts to present VFC
records as STMLF and vice versa, effectively hiding the internal record
structures. This could be done via open using the VMS extension "rfm=STMLF",
which should be the default unless it is a binary file, "rfm=unf". If the file
is VFC then CRTL could do the translation. Wishful thinking.
This could not be the default. What if LF characters are part of the
existing data record itself ? You have just destroyed the meaning of
the file in that case.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Greg Tinkler
2021-10-07 12:42:39 UTC
Post by Simon Clubley
How do you find byte 12,335,456 in a variable length RMS sequential file
without reading from the start of the file ?
That's why there are restrictions on RMS supported file formats in an
application in some cases.
The same way it is done on Unix: calculate the block offset, go get it, and extract the byte. No difference, and nothing to do with the underlying format.
Post by Simon Clubley
Post by Greg Tinkler
I always wondered why the CRTL did not have some smarts to present VFC
records as STMLF and vice versa, effectively hiding the internal record
structures. This could be done via open using the VMS extension "rfm=STMLF",
which should be the default unless it is a binary file, "rfm=unf". If the file
is VFC then CRTL could do the translation. Wishful thinking.
This could not be the default. What if LF characters are part of the
existing data record itself ? You have just destroyed the meaning of
the file in that case.
Please read what I wrote: if the file has been opened "b" then don't; otherwise we need to assume it is stmLF. Yup, probably another logical to set the default, but I'm pretty sure that if you create a new file using CRTL with the defaults then it will be stmLF anyway.

gt
Simon Clubley
2021-10-07 12:59:16 UTC
Post by Greg Tinkler
Post by Simon Clubley
How do you find byte 12,335,456 in a variable length RMS sequential file
without reading from the start of the file ?
That's why there are restrictions on RMS supported file formats in an
application in some cases.
The same way it is done on Unix, calculate the block offset, go get it, and extract the byte. no difference and nothing to do with the underlying format.
You don't know the block offset without scanning the file when it comes
to some RMS file formats.

IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
unless you restrict yourself to record formats that do not have embedded
record metadata.
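
To illustrate the point, here is a minimal C sketch that finds the file offset of the Nth data byte in a simplified model of a variable-length sequential file: each record is a 2-byte little-endian length, then the data, padded to an even length. The function name is made up, and the model ignores block-spanning rules; it only shows that the offset cannot be computed without walking the records.

```c
#include <stddef.h>

/* Return the file offset of data byte n (0-based), or -1 if there is no such
 * byte. `file` is an in-memory image of the simplified record layout. */
long var_data_offset(const unsigned char *file, size_t file_len, size_t n)
{
    size_t pos = 0;

    while (pos + 2 <= file_len) {
        /* 2-byte little-endian record length prefix */
        size_t rec_len = (size_t)file[pos] | ((size_t)file[pos + 1] << 8);

        if (rec_len == 0xFFFF)          /* end marker in this model */
            break;
        pos += 2;
        if (rec_len > file_len - pos)   /* truncated record */
            break;
        if (n < rec_len)
            return (long)(pos + n);

        n -= rec_len;
        pos += rec_len + (rec_len & 1); /* skip data plus pad byte if odd */
    }
    return -1;
}
```

For a file holding the three records A, BB and CCC, the six data bytes sit at file offsets 2, 6, 7, 10, 11 and 12 in this model, and the image is 14 bytes long.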

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2021-10-07 13:34:23 UTC
Post by Simon Clubley
Post by Greg Tinkler
Post by Simon Clubley
How do you find byte 12,335,456 in a variable length RMS sequential file
without reading from the start of the file ?
That's why there are restrictions on RMS supported file formats in an
application in some cases.
The same way it is done on Unix, calculate the block offset, go get it, and extract the byte. no difference and nothing to do with the underlying format.
You don't know the block offset without scanning the file when it comes
to some RMS file formats.
IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
unless you restrict yourself to record formats that do not have embedded
record metadata.
Yes.

And it does not get better when using standard C IO.

I suspect that the variable length file output below will
surprise a few *nix developers.

$ type var.txt
A
BB
CCC
$ type stmlf.txt
A
BB
CCC
$ type process.c
#include <stdio.h>
#include <sys/stat.h>

/* Read the file front to back with fgetc and print each byte in hex. */
void sequential(const char *fnm, int mode)
{
    FILE *fp;
    int ix, c;
    printf("%s sequential read (%s):", fnm, mode ? "binary" : "text");
    fp = fopen(fnm, mode ? "rb" : "r");
    ix = 0;
    while((c = fgetc(fp)) >= 0)
    {
        ix++;
        if(c >= 0)
        {
            printf(" %d=%02X", ix, c);
        }
        else
        {
            printf(" %d=-1", ix);
        }
    }
    printf("\n");
    fclose(fp);
}

/* fseek to every byte offset below siz and print what fgetc returns there. */
void direct(const char *fnm, int mode, int siz)
{
    FILE *fp;
    int ix, c;
    printf("%s direct read (%s):", fnm, mode ? "binary" : "text");
    fp = fopen(fnm, mode ? "rb" : "r");
    for(ix = 0; ix < siz; ix++)
    {
        fseek(fp, ix, SEEK_SET);
        c = fgetc(fp);
        if(c >= 0)
        {
            printf(" %d=%02X", ix + 1, c);
        }
        else
        {
            printf(" %d=-1", ix + 1);
        }
    }
    printf("\n");
    fclose(fp);
}

/* Report the stat size, then read the file all four ways. */
int main(int argc,char *argv[])
{
    struct stat buf;
    stat(argv[1], &buf);
    printf("%s size = %d bytes\n", argv[1], (int)buf.st_size);
    sequential(argv[1], 0);
    sequential(argv[1], 1);
    direct(argv[1], 0, (int)buf.st_size);
    direct(argv[1], 1, (int)buf.st_size);
    return 0;
}
$ cc process
$ link process
$ mcr sys$disk:[]process var.txt
var.txt size = 14 bytes
var.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
var.txt sequential read (binary): 1=41 2=42 3=42 4=43 5=43 6=43
var.txt direct read (text): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1 9=43
10=-1 11=-1 12=-1 13=FF 14=-1
var.txt direct read (binary): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
9=43 10=-1 11=-1 12=-1 13=FF 14=-1
$ mcr sys$disk:[]process stmlf.txt
stmlf.txt size = 9 bytes
stmlf.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt sequential read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt direct read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
stmlf.txt direct read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A

Arne
Arne Vajhøj
2021-10-07 14:42:26 UTC
        fseek(fp, ix, SEEK_SET);
var.txt size = 14 bytes
var.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
var.txt sequential read (binary): 1=41 2=42 3=42 4=43 5=43 6=43
var.txt direct read (text): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1 9=43
10=-1 11=-1 12=-1 13=FF 14=-1
var.txt direct read (binary): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
9=43 10=-1 11=-1 12=-1 13=FF 14=-1
$ mcr sys$disk:[]process stmlf.txt
stmlf.txt size = 9 bytes
stmlf.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt sequential read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt direct read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
stmlf.txt direct read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
In all fairness, I believe there is some documentation
somewhere that states that fseek is only supported to the
beginning of a record. I cannot find it right now,
but I believe I once saw it somewhere.

Arne
Craig A. Berry
2021-10-07 16:01:37 UTC
Post by Arne Vajhøj
         fseek(fp, ix, SEEK_SET);
var.txt size = 14 bytes
var.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
var.txt sequential read (binary): 1=41 2=42 3=42 4=43 5=43 6=43
var.txt direct read (text): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
9=43 10=-1 11=-1 12=-1 13=FF 14=-1
var.txt direct read (binary): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
9=43 10=-1 11=-1 12=-1 13=FF 14=-1
$ mcr sys$disk:[]process stmlf.txt
stmlf.txt size = 9 bytes
stmlf.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt sequential read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt direct read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
stmlf.txt direct read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
In all fairness then I believe there are some documentation
somewhere that states that fseek is only supported to
beginning of a record. I cannot find it right now,
but I believe I once saw it somewhere.
Is this what you're looking for?

$ help crtl fseek description

CRTL

fseek

Description

The fseek function can position a fixed-length record-access
file with no carriage control or a stream-access file on any
byte offset, but can position all other files only on record
boundaries.

The available Standard I/O functions position a variable-length
or VFC record file at its first byte, at the end-of-file, or on
a record boundary. Therefore, the arguments given to fseek must
specify any of the following:

o The beginning or end of the file

o A 0 offset from the current position (an arbitrary record
boundary)

o The position returned by a previous, valid ftell call
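
The third option in that list is the portable way to do random access on record files: remember positions with ftell while reading sequentially, then hand them back to fseek. A minimal C sketch using only standard I/O calls; the helper names are made up for illustration.

```c
#include <stdio.h>

/* Walk the file once and save the ftell() position of the start of each
 * line, up to max entries. Returns the number of positions saved. */
size_t index_lines(FILE *fp, long *pos, size_t max)
{
    size_t n = 0;
    int at_start = 1;
    int c;

    rewind(fp);
    for (;;) {
        long here = ftell(fp);      /* position before reading this byte */
        c = fgetc(fp);
        if (c == EOF)
            break;
        if (at_start && n < max)
            pos[n++] = here;
        at_start = (c == '\n');
    }
    return n;
}

/* Seek to a previously saved line start and return its first character. */
int first_char_of_line(FILE *fp, const long *pos, size_t i)
{
    if (fseek(fp, pos[i], SEEK_SET) != 0)
        return EOF;
    return fgetc(fp);
}
```

The saved values are treated as opaque tokens, not byte counts, so the same code does not depend on the on-disk record format.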
Arne Vajhøj
2021-10-07 16:52:21 UTC
Post by Craig A. Berry
Post by Arne Vajhøj
         fseek(fp, ix, SEEK_SET);
var.txt size = 14 bytes
var.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
var.txt sequential read (binary): 1=41 2=42 3=42 4=43 5=43 6=43
var.txt direct read (text): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
9=43 10=-1 11=-1 12=-1 13=FF 14=-1
var.txt direct read (binary): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
9=43 10=-1 11=-1 12=-1 13=FF 14=-1
$ mcr sys$disk:[]process stmlf.txt
stmlf.txt size = 9 bytes
stmlf.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt sequential read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43
7=43 8=43 9=0A
stmlf.txt direct read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
stmlf.txt direct read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
In all fairness then I believe there are some documentation
somewhere that states that fseek is only supported to
beginning of a record. I cannot find it right now,
but I believe I once saw it somewhere.
Is this what you're looking for?
$ help crtl fseek description
CRTL
  fseek
    Description
         The fseek function can position a fixed-length record-access
         file with no carriage control or a stream-access file on any
         byte offset, but can position all other files only on record
         boundaries.
         The available Standard I/O functions position a variable-length
         or VFC record file at its first byte, at the end-of-file, or on
         a record boundary. Therefore, the arguments given to fseek must
         o  The beginning or end of the file
         o  A 0 offset from the current position (an arbitrary record
            boundary)
         o  The position returned by a previous, valid ftell call
YES.

And shame on me, because I only checked help crtl fseek arguments.

Arne
Dave Froble
2021-10-07 17:00:35 UTC
Post by Arne Vajhøj
I suspect that the variable length file output below will
surprise a few *nix developers.
Why do you post C code examples that confuse me and give me a headache?

:-)

Then again, Basic code examples might confuse Unix developers ...
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Arne Vajhøj
2021-10-07 17:09:33 UTC
Post by Dave Froble
Post by Arne Vajhøj
I suspect that the variable length file output below will
surprise a few *nix developers.
Why do you post C code examples that confuse me and give me a headache?
:-)
Then again, Basic code examples might confuse Unix developers ...
Sorry about the headache.

But the topic was identical code on *nix and VMS trying to
access a random position in a file.

C is available on both *nix and VMS so it was rather
obvious.

VMS Basic is not available on *nix.

I don't think there are quite the same options
in VMS Basic as in C for this, but I expect all the
options available in VMS Basic to produce a natural
expected result.

Arne
Simon Clubley
2021-10-07 17:53:34 UTC
Post by Dave Froble
Post by Arne Vajhøj
I suspect that the variable length file output below will
surprise a few *nix developers.
Why do you post C code examples that confuse me and give me a headache?
:-)
Then again, Basic code examples might confuse Unix developers ...
Some of them might be aware of Basic.

Back in the later MS-DOS days, Microsoft used to ship a Basic
interpreter for free with MS-DOS (and apparently some Windows versions):

https://en.wikipedia.org/wiki/QBasic

I've just discovered there's a QuickBASIC-compatible Basic compiler for Linux:

https://en.wikipedia.org/wiki/FreeBASIC

which I did not know about.

Just been reminded that Gorillas.bas was released 30 years ago.

I am now depressed. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2021-10-07 18:13:45 UTC
Post by Simon Clubley
Post by Dave Froble
Then again, Basic code examples might confuse Unix developers ...
Some of them might be aware of Basic.
Back in the later MS-DOS days, Microsoft used to ship a Basic
https://en.wikipedia.org/wiki/QBasic
GW-Basic came with DOS 1-4 and QBasic with DOS 5-6 and early Windows I
believe.

GW-Basic source code is now available at:
https://github.com/microsoft/GW-BASIC
Post by Simon Clubley
https://en.wikipedia.org/wiki/FreeBASIC
which I did not know about.
I would still not expect many Linux people to know Basic.

And besides VMS Basic is somewhat different from MS Basic flavors.

Arne
Dave Froble
2021-10-07 16:57:28 UTC
Post by Simon Clubley
Post by Greg Tinkler
Post by Simon Clubley
How do you find byte 12,335,456 in a variable length RMS sequential file
without reading from the start of the file ?
That's why there are restrictions on RMS supported file formats in an
application in some cases.
The same way it is done on Unix, calculate the block offset, go get it, and extract the byte. no difference and nothing to do with the underlying format.
You don't know the block offset without scanning the file when it comes
to some RMS file formats.
IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
unless you restrict yourself to record formats that do not have embedded
record metadata.
Simon.
I'm guessing Unix files don't have metadata and such. So the comparison
is not valid.

For a non-RMS file, yes, the location can be calculated. But not so for
an RMS file with record characteristics included in the records.

Since Unix doesn't have RMS files, perhaps that confused Greg.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-10-07 18:07:00 UTC
Post by Dave Froble
Post by Simon Clubley
Post by Greg Tinkler
Post by Simon Clubley
How do you find byte 12,335,456 in a variable length RMS sequential file
without reading from the start of the file ?
That's why there are restrictions on RMS supported file formats in an
application in some cases.
The same way it is done on Unix, calculate the block offset, go get it, and extract the byte. no difference and nothing to do with the underlying format.
You don't know the block offset without scanning the file when it comes
to some RMS file formats.
IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
unless you restrict yourself to record formats that do not have embedded
record metadata.
I'm guessing Unix files don't have metadata and such. So the comparison
is not valid.
No, Unix doesn't. At Unix filesystem level, files are just a stream of bytes.

The next layer up on Unix is the C RTL. There's nothing like RMS
between the filesystem and the C RTL on Unix.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2021-10-07 18:18:19 UTC
Post by Simon Clubley
Post by Dave Froble
I'm guessing Unix files don't have metadata and such. So the comparison
is not valid.
No, Unix doesn't. At Unix filesystem level, files are just a stream of bytes.
The next layer up on Unix is the C RTL. There's nothing like RMS
between the filesystem and the C RTL on Unix.
Unix file systems do not have metadata about how the
bytes are to be read/interpreted (like VMS: ORG, RFM, RAT,
MRS etc.). They do have some general metadata (owner,
protection, size, timestamp).

Arne
Stephen Hoffman
2021-10-07 15:34:48 UTC
Post by Greg Tinkler
My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS
solution is much much easier to program, uses well tested code, and is
already cluster ready putting the team ahead of the game and not
building issues for the future.
...

Fix the existing RMS data corruption in 32-bit RMS and/or in the C
library, and get PostgreSQL available on OpenVMS soonest. I expect this
is the priority for VSI.

Everything else is aspirational.

Integrate stream file access support at the XQP and allow C and C++ and
other non-punched-card-style app designs and stream- and OO-focused
languages to optionally bypass RMS entirely.

Better integrate and document the existing range-locking support
available within DLM.

And in aggregate, stop trying to make the current 32-bit RMS NoSQL
database more complex than it already is, and re-architect such that
32-bit RMS NoSQL database becomes just another available database, and
preferably while providing room for 64-bit RMS rather than trying
another OpenVMS Alpha V7.0-style 32-/64-bit or FAB/RAB/RAB64/NAM/NAML
design, and make 32- or (hypothetical) 64-bit RMS not the sole
persistent-storage "funnel" for structured file access for apps running
on OpenVMS, short of those few using XQP or LOG_IO or PHY_IO. Existing
RMS apps are already headed for "fun" as part of the upcoming 64-bit
LBN work for VSI and for apps, and a whole lot of those apps just won't
make it past messes similar to apps still tied to ODS-2 naming. I'd
wager that most existing apps don't yet fully support ODS-5 naming,
UTF-8 and all, too. Similar app messes with latent 32-bit RMS
dependencies.
--
Pure Personal Opinion | HoffmanLabs LLC
Dave Froble
2021-10-07 17:16:03 UTC
Post by Stephen Hoffman
Post by Greg Tinkler
My point is SSIO seems to be focused on just PostgreSQL, whereas an
RMS solution is much much easier to program, uses well tested code,
and is already cluster ready putting the team ahead of the game and
not building issues for the future.
...
Fix the existing RMS data corruption in 32-bit RMS and/or in the C
library, and get PostgreSQL available on OpenVMS soonest. I expect this
is the priority for VSI.
Most likely.
Post by Stephen Hoffman
Everything else is aspirational.
Integrate stream file access support at the XQP and allow C and C++ and
other non-punched-card-style app designs and stream- and OO-focused
languages to optionally bypass RMS entirely.
I don't use C, so I don't know much about it. But isn't this capability
already available? Even RMS has the BLOCK I/O capability, at least from
Basic.

As far as I know, QIO doesn't know a thing about RMS. Well, the
directory structure does know RMS, and to an extent is RMS.
Post by Stephen Hoffman
Better integrate and document the existing range-locking support
available within DLM.
Yes, for sure. And if needed, make it much better.
Post by Stephen Hoffman
And in aggregate, stop trying to make the current 32-bit RMS NoSQL
database more complex than it already is, and re-architect such that
32-bit RMS NoSQL database becomes just another available database, and
preferably while providing room for 64-bit RMS rather than trying
another OpenVMS Alpha V7.0-style 32-/64-bit or FAB/RAB/RAB64/NAM/NAML
design, and make 32- or (hypothetical) 64-bit RMS not the sole
persistent-storage "funnel" for structured file access for apps running
on OpenVMS, short of those few using XQP or LOG_IO or PHY_IO. Existing
RMS apps are already headed for "fun" as part of the upcoming 64-bit LBN
work for VSI and for apps, and a whole lot of those apps just won't make
it past messes similar to apps still tied to ODS-2 naming. I'd wager
that most existing apps don't yet fully support ODS-5 naming, UTF-8 and
all, too. Similar app messes with latent 32-bit RMS dependencies.
Oh, no, Steve. That is much too logical and reasonable. Can't have
that. We must ensure that things stay totally screwed up.

Don't know how far work had progressed on alternate file systems. Might
or might not help to make RMS "just another capability". But, doing
what you suggest would go a long way toward making VMS more useful in
the future.

I've got the suspicion that VMS clusters, while good, create some of the
problems in attempting to add new capabilities to VMS. Need I mention
"MOUNT"? Better segregation might help to add new and different
capabilities. Not sure how easy that might be.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Arne Vajhøj
2021-10-07 17:25:30 UTC
Post by Stephen Hoffman
Integrate stream file access support at the XQP and allow C and C++ and
other non-punched-card-style app designs and stream- and OO-focused
languages to optionally bypass RMS entirely.
I don't use C, so I don't know much about it.  But isn't this capability
already available?  Even RMS has the BLOCK I/O capability, at least from
Basic.
C/C++ and most newer languages have a "stream view" of files while
RMS has a "record view" of files.

If they used different file systems everything would be fine.

If all text files are STMLF then it works and the "stream view"
and the "record view" produces consistent results.

But trying to mix on variable length or VFC files becomes
a minefield.

I know you don't like C, but try looking at the example I posted.
Some of the outputs are very weird.
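
Why the two views agree for STMLF can be shown with a small Python sketch. It assumes only that a stream-LF file is a run of bytes in which each record ends with a single LF; the helper names are hypothetical.

```python
def records_from_stmlf(image):
    """Record view: split the byte stream at each LF terminator."""
    parts = image.split(b"\n")
    if parts and parts[-1] == b"":
        parts.pop()  # nothing follows the final terminator
    return parts

def stream_from_records(records):
    """Stream view: the records written back with their terminators."""
    return b"".join(rec + b"\n" for rec in records)

image = b"ABC\nDEFGH\nIJ\n"
records = records_from_stmlf(image)
```

The terminator is an ordinary byte inside the stream, so converting in either direction loses nothing and byte offsets mean the same thing in both views. Variable length and VFC records carry counts and headers that a byte-oriented reader never expects to see, which is where the mismatch comes from.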

Arne
Simon Clubley
2021-10-07 18:28:04 UTC
Post by Dave Froble
Don't know how far work had progressed on alternate file systems. Might
or might not help to make RMS "just another capability". But, doing
what you suggest would go a long way toward making VMS more useful in
the future.
I've got the suspicion that VMS clusters, while good, create some of the
problems in attempting to add new capabilities to VMS. Need I mention
"MOUNT"? Better segregation might help to add new and different
capabilities. Not sure how easy that might be.
VMS clusters at conceptual level are not the problem. They offer
some very nice functionality that only recently is beginning to
appear elsewhere. They were literally a generation ahead of what
was available elsewhere when they were released.

The problem is how VMS was designed in those early days before
modular and layered computing really took off.

The VMS filesystem code, including MOUNT as you say, is a _horrible_
monolithic mass of closely interlinked code without any clear
boundaries between them that allow people (including end users) to
easily plug in new functionality and new filesystems.

The same is true for VMS CLIs BTW. DCL is tightly bound into VMS
in a horrible way it should not be. On Linux, both the command
shell and filesystem architectures are vastly cleaner and more
modular than they are on VMS.

However, if VMS had been designed in a later era, there would be
absolutely nothing stopping VMS having a cleaner internal architecture
_and_ also having world-leading cluster capabilities that are only
now just being equalled elsewhere.

IOW, it's not clustering that's the problem - it's the fact that
VMS wasn't implemented 5 to 10 years later than it was.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Stephen Hoffman
2021-10-07 18:34:19 UTC
IOW, it's not clustering that's the problem - it's the fact that VMS
wasn't implemented 5 to 10 years later than it was.
...Or that OpenVMS and its apps weren't later migrated to DEC MICA.
Which is kinda-sorta what you're referring to.
--
Pure Personal Opinion | HoffmanLabs LLC
Stephen Hoffman
2021-10-07 18:30:50 UTC
Post by Dave Froble
Post by Stephen Hoffman
Post by Greg Tinkler
My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS
solution is much much easier to program, uses well tested code, and is
already cluster ready putting the team ahead of the game and not
building issues for the future.
...
Fix the existing RMS data corruption in 32-bit RMS and/or in the C
library, and get PostgreSQL available on OpenVMS soonest. I expect this
is the priority for VSI.
Most likely.
Post by Stephen Hoffman
Everything else is aspirational.
Integrate stream file access support at the XQP and allow C and C++ and
other non-punched-card-style app designs and stream- and OO-focused
languages to optionally bypass RMS entirely.
I don't use C, so I don't know much about it. But isn't this
capability already available?
The C standard functions—the equivalent of the BASIC calls OPEN, READ,
WRITE, et al—are via RMS. There's no knob to tell C "don't do that".

The default sequential file format that C creates on OpenVMS is RMS
VFC, which has been a perpetual source of confusion and consternation
for users new to C on OpenVMS.
Post by Dave Froble
Even RMS has the BLOCK I/O capability, at least from Basic.
C doesn't do sector I/O within the standard library, though the native
platform calls are easily available.
Post by Dave Froble
As far as I know, QIO doesn't know a thing about RMS. Well, the
directory structure does know RMS, and to an extent is RMS.
$qio (and $io_perform) offer sector access through RMS (virtual),
record access through RMS (virtual), or access to device through the
file system (IO$_ACPCONTROL XQP), or direct access to the device driver
and device (logical and physical I/O).

The VIRT_IO virtual I/O paths through RMS and through the XQP are
cluster-aware, while the LOG_IO logical and PHY_IO physical I/O paths
are not.

RMS provides record locking for cluster coordination, while the XQP
provides coordination for the on-disk file system.
Post by Dave Froble
Post by Stephen Hoffman
Better integrate and document the existing range-locking support
available within DLM.
Yes, for sure. And if needed, make it much better.
Post by Stephen Hoffman
And in aggregate, stop trying to make the current 32-bit RMS NoSQL
database more complex than it already is, and re-architect such that
32-bit RMS NoSQL database becomes just another available database, and
preferably while providing room for 64-bit RMS rather than trying
another OpenVMS Alpha V7.0-style 32-/64-bit or FAB/RAB/RAB64/NAM/NAML
design, and make 32- or (hypothetical) 64-bit RMS not the sole
persistent-storage "funnel" for structured file access for apps running
on OpenVMS, short of those few using XQP or LOG_IO or PHY_IO. Existing
RMS apps are already headed for "fun" as part of the upcoming 64-bit
LBN work for VSI and for apps, and a whole lot of those apps just won't
make it past messes similar to apps still tied to ODS-2 naming. I'd
wager that most existing apps don't yet fully support ODS-5 naming,
UTF-8 and all, too. Similar app messes with latent 32-bit RMS
dependencies.
Oh, no, Steve. That is much too logical and reasonable. Can't have
that. We must ensure that things stay totally screwed up.
I'd prefer an approach where there's some opportunity to ease new work
and new APIs into production, and to also retire overtly-busted APIs.

Oracle Rdb was really good at that migration and for as far as that
went, but most other apps and OpenVMS itself have not managed to copy
that. Not successfully.
Post by Dave Froble
Don't know how far work had progressed on alternate file systems.
Might or might not help to make RMS "just another capability". But,
doing what you suggest would go a long way toward making VMS more
useful in the future.
I've got the suspicion that VMS clusters, while good, create some of
the problems in attempting to add new capabilities to VMS. Need I
mention "MOUNT"? Better segregation might help to add new and
different capabilities. Not sure how easy that might be.
Oracle Rdb and some other databases have cluster access locking,
whether using DLM or database-level locking.

Other databases can be single-host.

The SQLite port to OpenVMS supports DLM and clustering.

PostgreSQL has been adding replication and clustering:
https://www.postgresql.org/docs/9.5/different-replication-solutions.html

Whether an OpenVMS port of PostgreSQL can incorporate DLM calls is
fodder for future discussions, once the SSIO prerequisite becomes
available and a hypothetical future PostgreSQL port becomes stable. A
stable PostgreSQL will interest some folks, with adoptions depending on
both intrinsic interest and, um, potential extrinsic factors not yet in
evidence.

And no, you need not mention MOUNT, having necessarily (re)written what
MOUNT provides on several occasions.
--
Pure Personal Opinion | HoffmanLabs LLC
Arne Vajhøj
2021-10-07 18:36:28 UTC
Post by Stephen Hoffman
https://www.postgresql.org/docs/9.5/different-replication-solutions.html
Whether an OpenVMS port of PostgreSQL can incorporate DLM calls is
fodder for future discussions, once the SSIO prerequisite becomes
available and a hypothetical future PostgreSQL port becomes stable. A
stable PostgreSQL will interest some folks, with adoptions depending on
both intrinsic interest and, um, potential extrinsic factors not yet in
evidence.
PostgreSQL clusters are active/passive.

All updates and typically all reads go to the active node,
and updates get replicated from the active node to the passive nodes.

I believe it is possible to have the passive nodes support
reading.

But with only the active node taking updates then there
is no need for DLM.

(VMS people may not even call such a config a cluster, but ...)

Arne
Stephen Hoffman
2021-10-07 19:09:10 UTC
PostgreSQL clusters are active/passive. ...
For folks interested in this general topic area with PostgreSQL around
failover and replication, please see the PostgreSQL documentation for
details.

Here's an updated link from what I'd posted earlier:
https://www.postgresql.org/docs/14/different-replication-solutions.html

If there's interest in adding what OpenVMS calls clustering within any
hypothetical future PostgreSQL port, use of the DLM will undoubtedly be
considered.

nb: PostgreSQL uses the term "cluster" for something entirely different
and unrelated to OpenVMS clustering.
--
Pure Personal Opinion | HoffmanLabs LLC
Lawrence D’Oliveiro
2021-10-07 01:51:31 UTC
Post by John Dallman
People used to UNIX or Windows generally find the other VMS file types
baffling and confusing.
One question I never saw answered (because I never came across examples of files to check it) was whether in “VFC” files, the record count included the fixed header or not? And was that the same or different in the on-disk format versus the in-memory RMS structure with the “RSZ” (“RAB$W_RSZ”?) field?

By the way, I knew FORTRAN carriage control is now an anachronism, but I didn’t realize that it is now considered so obsolete, that compilers won’t support it any more.
Arne Vajhøj
2021-10-07 01:59:41 UTC
Post by Lawrence D’Oliveiro
Post by John Dallman
People used to UNIX or Windows generally find the other VMS file types
baffling and confusing.
One question I never saw answered (because I never came across
examples of files to check it) was whether in “VFC” files, the record
count included the fixed header or not? And was that the same or
different in the on-disk format versus the in-memory RMS structure
with the “RSZ” (“RAB$W_RSZ”?) field?
Try it!

$ open/write z.z z.z
$ write z.z "ABC"
$ close z.z
$ dir/full z.z

Directory DISK2:[ARNE]

z.z;1 File ID: (5295,236,0)
...
Record format: VFC, 2 byte header, maximum 0 bytes, longest 3 bytes
...
$ dump z.z

Dump of file DISK2:[ARNE]z.z;1 on 6-OCT-2021 21:54:39.48
File ID (5295,236,0) End of file block 1 / Allocated 16

Virtual block number 1 (00000001), 512 (0200) bytes

00000000 00000000 00000000 00000000 00000000 0000FFFF 00434241 8D010005 ....ABC......................... 000000
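
For anyone reading the dump without a VMS system at hand, here is a minimal Python sketch that decodes the first record. The byte string is transcribed from the DUMP output above (DUMP prints longwords right to left, so the file begins 05 00 01 8D). The function name is hypothetical, and treating a count of 0xFFFF as "no further record here" is an assumption based on this one dump.

```python
import struct

# First ten bytes of virtual block 1, in file order.
block = bytes.fromhex("0500018d41424300ffff")

FIXED_HEADER_SIZE = 2  # "VFC, 2 byte header" per DIR/FULL

def decode_vfc_record(buf, offset=0, fsz=FIXED_HEADER_SIZE):
    """Decode one VFC record; return (count, header, data, next_offset)."""
    (count,) = struct.unpack_from("<H", buf, offset)
    if count == 0xFFFF:
        return None  # assumed marker: no further record at this position
    start = offset + 2
    header = buf[start:start + fsz]        # fixed control area
    data = buf[start + fsz:start + count]  # variable part of the record
    next_offset = start + count + (count & 1)  # records are word aligned
    return count, header, data, next_offset

count, header, data, next_offset = decode_vfc_record(block)
```

The stored count is 5 while the data is the 3 bytes "ABC", so in this file the on-disk count includes the 2-byte fixed header, whereas DIR/FULL reports the longest record as 3 bytes. The dump by itself says nothing about what RMS returns in RAB$W_RSZ.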

Arne
Simon Clubley
2021-10-07 12:12:55 UTC
Post by John Dallman
Post by David Jones
Open source software ports often comes with the restriction that it
only works with stream-LF files. Maybe they should add a flag to
directory files that if set only allows it to contain stream-LF
or directory files.
People used to UNIX or Windows generally find the other VMS file types
baffling and confusing. I got used to the idea, but never made use of
them, since my employers already had fewer customers on VMS than they did
UNIX when I joined, and the disparity only increased.
That's because asking Unix/Windows people to learn about VMS records and
file structures is like asking a VMS person to learn about how to work
with records and files on z/OS using traditional z/OS methods.

It is something so very, very, different from what they are used to.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Lawrence D’Oliveiro
2021-10-07 01:45:41 UTC
Post by David Jones
Open source software ports often comes with the restriction that it only works
with stream-LF files.
I would say that’s partially true. Typically there are options to treat files as “text” or “binary”. A “binary” file is just a stream of arbitrary 8-bit bytes, which are supposed to be read or written without any imposition of record boundaries, sector-size rounding or special treatment of any byte values.

A “text” file is assumed to be broken up into lines. It is true that LF is the traditional Unix line delimiter. But enlightened toolkits like Python are capable of reading text files in “universal newline” mode, so for example if you copy a text file created on MS-DOS (line delimiter = CR+LF, because CP/M did it that way, for no rational reason) in binary mode onto a Linux system, your Python text-processing script running on the latter can cope with it without a hiccup.
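
A short Python sketch of the universal newline behaviour described above. It uses only the standard `io` module, with an in-memory buffer standing in for a file copied over in binary mode; the helper name is made up for the example.

```python
import io

# Bytes as they might arrive from three different systems:
# CR+LF (MS-DOS), bare CR, and LF (Unix).
raw = b"first\r\nsecond\rthird\n"

def read_text_universal(data, encoding="ascii"):
    """Read bytes as text with universal newline translation."""
    # newline=None enables universal newlines: "\r\n", "\r" and "\n"
    # are all recognised as line ends and returned as "\n".
    with io.TextIOWrapper(io.BytesIO(data), encoding=encoding,
                          newline=None) as text:
        return text.read()

translated = read_text_universal(raw)
lines = translated.splitlines()

# Binary mode applies no translation at all.
untouched = io.BytesIO(raw).read()
```

The same `newline=None` default applies to the built-in `open()` in text mode, which is why a script can read a CR+LF file on Linux without special handling.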
Dave Froble
2021-10-06 13:45:19 UTC
Post by Craig A. Berry
Post by Greg Tinkler
An extra that could be added, if the file is RFM=fixed, and the C
code uses it that way with the same record length then use the
SYS$GET/SYS$PUT so it will play nicely with an RMS access to those files.
I don't know the degree to which the current plan corresponds to the
original plan from a decade or so ago, but back then only stream files
were going to be supported by SSIO, which makes sense since the whole
point is locking byte ranges.
It has been my impression that for quite some time at HP, work on
specific requests tended to be very specific to that request, and failed
to consider capabilities as general to VMS.

The approach to SSIO appears to be an example of this. Basically, do
the least required to achieve the specific result. In the case of SSIO
the result appears to be rather useless, at least so far.

For some years I've advocated a more general enhancement to the VMS DLM,
specifically, numeric range locking. Such would address a basic issue
I've had with the VMS DLM for a rather long time.

I've a database product, a rather old product. At the time it was
implemented it was rather useful. But there was a locking issue. The
DLM locks resource names. The database would support I/O transfers of 1
to 127 disk blocks. How would one lock 127 contiguous disk blocks? The
blunt force method could be taking out 127 locks, not an optimum
solution. Having numeric range locking back in 1984 would have been
quite useful.

I've also suggested in the past that a simple enhancement to the DLM,
specifically the addition of a "type of lock" with the capability of
adding logic for specific "types" would solve the locking part of SSIO
and do so as a part of VMS, not as part of the CRTL.
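
What numeric range locking means can be sketched in a few lines of Python. This is a single-process toy, not the DLM and not any existing VMS interface; the class and method names are hypothetical. It shows one range request covering 127 contiguous blocks, with conflicts decided by interval overlap rather than by resource name.

```python
class RangeLockTable:
    """Toy lock table keyed on half-open numeric ranges [start, end)."""

    def __init__(self):
        self._held = []  # entries: (start, end, owner, exclusive)

    def try_lock(self, owner, start, count, exclusive=True):
        """Grant the range unless it overlaps an incompatible lock."""
        end = start + count
        for s, e, o, x in self._held:
            overlaps = start < e and s < end
            # Shared locks may overlap each other; anything else conflicts.
            if overlaps and o != owner and (exclusive or x):
                return False
        self._held.append((start, end, owner, exclusive))
        return True

    def unlock(self, owner, start, count):
        """Release a previously granted range."""
        end = start + count
        for i, (s, e, o, _x) in enumerate(self._held):
            if (s, e, o) == (start, end, owner):
                del self._held[i]
                return
        raise KeyError("no such lock")

table = RangeLockTable()
got_127_blocks = table.try_lock("writer-1", start=1000, count=127)
```

A byte-range lock for SSIO would be the same idea with byte offsets instead of block numbers, which is the sense in which the locking part could live in a general facility rather than in the CRTL.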

As for byte range I/O, I'm not sure what is and isn't possible with disk
drives. It has been my impression that only whole block transfers are
possible. Perhaps I've been wrong. Perhaps SSDs have more flexibility.

Not really an issue for me anymore.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Arne Vajhøj
2021-10-06 14:37:05 UTC
Post by Dave Froble
Post by Craig A. Berry
Post by Greg Tinkler
An extra that could be added, if the file is RFM=fixed, and the C
code  uses it that way with the same record length then use the
SYS$GET/SYS$PUT so it will play nicely with an RMS access to those files.
I don't know the degree to which the current plan corresponds to the
original plan from a decade or so ago, but back then only stream files
were going to be supported by SSIO, which makes sense since the whole
point is locking byte ranges.
It has been my impression that for quite some time at HP, work on
specific requests tended to be very specific to that request, and failed
to consider capabilities as general to VMS.
The approach to SSIO appears to be an example of this.  Basically, do
the least required to achieve the specific result.  In the case of SSIO
the result appears to be rather useless, at least so far.
General is better than specific.

When not considering resources.

My impression is that VSI engineering resources are very limited - and
several orders of magnitude smaller than DEC 40 years ago.

So when they have the choice of solving something 80% for 200 hours of
effort or 100% for 1000 hours of effort then ...
Post by Dave Froble
For some years I've advocated a more general enhancement to the VMS DLM,
specifically, numeric range locking.  Such would address a basic issue
I've had with the VMS DLM for a rather long time.
I've also suggested in the past that a simple enhancement to the DLM,
specifically the addition of a "type of lock" with the capability of
adding logic for specific "types" would solve the locking part of SSIO
and do so as a part of VMS, not as part of the CRTL.
That would make sense to me.

But I do not count.
Post by Dave Froble
As for byte range I/O, I'm not sure what is and isn't possible with disk
drives.  It has been my impression that only whole block transfers are
possible.  Perhaps I've been wrong.  Perhaps SSDs have more flexibility.
No matter what the disk can do, the VMS file system is still
block oriented, and I believe the system services take block offsets,
not byte offsets.
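
A byte-offset write layered on block-only transfers comes down to read-modify-write. Below is a minimal Python sketch under stated assumptions: 512-byte blocks, 1-based virtual block numbers, and an in-memory `BlockStore` class that is purely hypothetical and stands in for whatever does the real block I/O. No locking or cache coordination is shown, and that is exactly the part that makes the real problem hard.

```python
BLOCK = 512

class BlockStore:
    """In-memory stand-in for a device that transfers whole blocks only."""

    def __init__(self, nblocks):
        self.blocks = [bytes(BLOCK) for _ in range(nblocks)]

    def read_block(self, vbn):
        return self.blocks[vbn - 1]  # VBNs are 1-based

    def write_block(self, vbn, data):
        assert len(data) == BLOCK
        self.blocks[vbn - 1] = bytes(data)

def write_bytes(store, offset, data):
    """Write data at a byte offset; return the number of blocks touched."""
    if not data:
        return 0
    first = offset // BLOCK + 1
    last = (offset + len(data) - 1) // BLOCK + 1
    # Read every affected block. Strictly, only a partially covered
    # first or last block needs its old contents.
    buf = bytearray()
    for vbn in range(first, last + 1):
        buf += store.read_block(vbn)
    start = offset - (first - 1) * BLOCK
    buf[start:start + len(data)] = data
    for i, vbn in enumerate(range(first, last + 1)):
        store.write_block(vbn, buf[i * BLOCK:(i + 1) * BLOCK])
    return last - first + 1

store = BlockStore(4)
touched = write_bytes(store, 510, b"ABCD")  # straddles blocks 1 and 2
```

If another writer changes the same block between the read and the write, one of the two updates is lost, so byte-range I/O needs the pre-read and the write-back to be covered by a lock or done against a shared cache.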

Arne
Arne Vajhøj
2021-10-06 13:01:17 UTC
Post by Greg Tinkler
I notice that SSIO (beta) in included in an up coming V9.1 field
test. So I read up on the issues it is trying to solve.
One concerning thing was to have CRTL (via SSIO) access directly to
XFC. From an architectural point of view this is wrong at so many
levels, but if that is what needs to happen then open it up so RMS
and other code bases can use it.
The main reason stated was the need to do byte offset/count IO’s.
Well lets solve that first, change RMS by adding SYS$READB and
SYS$WRITEB. These would be useful to all code using RMS. SYS$READB
read from byte offset for count, return latest data from that byte
range. SYS$WRITEB write from byte offset for count, update latest
copy of underlying blocks.
By having these as part of RMS we want to ensure the blocks/buffers
are coordinated so any other user of RMS will see the changes, and we
get their changes.
This seems to be at the core of the CRTL issue, it does NOT use RMS,
nor does it synchronize its blocks/buffers, leading to the lost
update problem.
So with this ‘simple’ addition the CRTL should be altered to us RMS for all file IO.
An extra that could be added, if the file is RFM=fixed, and the C
code uses it that way with the same record length then use the
SYS$GET/SYS$PUT so it will play nicely with an RMS access to those
files.
To be honest, I think the safest way to implement this is
to put lots of restrictions on when it is doable.

Examples:
* No cluster support (announcement already states that!)
* Only FIX 512, STMLF and UDF are supported
* no mixing with traditional RMS calls
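
Those example restrictions amount to a simple gate in front of the byte-oriented path. Here is a Python sketch of such a check. It encodes only the three example rules listed above; the function name and parameters are hypothetical and do not describe what SSIO actually does.

```python
# Record formats accepted as-is under the example rules.
SUPPORTED_FORMATS = {"STMLF", "UDF"}

def byte_io_allowed(rfm, mrs=0, clustered=False, rms_record_access=False):
    """Return True if a file qualifies for byte offset/count I/O."""
    if clustered:
        return False           # rule 1: no cluster support
    if rms_record_access:
        return False           # rule 3: no mixing with traditional RMS calls
    if rfm == "FIX":
        return mrs == 512      # rule 2: fixed records only at 512 bytes
    return rfm in SUPPORTED_FORMATS

ok_for_stream = byte_io_allowed("STMLF")
ok_for_vfc = byte_io_allowed("VFC")
```

Refusing everything outside a short list keeps the supported cases to ones where file byte N and data byte N are the same thing.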

Some applications coming over from *nix, most notably PostgreSQL, need
this. But trying to cover all types of cases would be a lot of
work.

Arne