Discussion:
Indexed file read question
abrsvc
2020-11-18 15:23:13 UTC
I was asked the following question and am not 100% sure of the answer, so...

Situation: Indexed file with 2 keys, the first key no duplicates, second key no duplicates. For clarity, the first key is a record number and the second key is one of 4 values: A,B,L or blank.

A read is posted using the primary key. The next read uses the secondary key say of value A. Does the second read select the matching record within the context of the primary key or just the first record encountered with a matching secondary key?

I believe that the record context remains. Am I correct?
Stephen Hoffman
2020-11-18 16:20:50 UTC
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
--
Pure Personal Opinion | HoffmanLabs LLC
Arne Vajhøj
2020-11-18 16:23:55 UTC
Situation:  Indexed file with 2 keys, the first key no duplicates,
second key no duplicates.  For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file?  Which implies
there *are* duplicates on the secondary?
I suspect that he means no duplicates for second key PER FIRST KEY.

It was not what was written, but it is what makes sense.

Arne
Dave Froble
2020-11-18 16:36:55 UTC
Post by Arne Vajhøj
Post by Stephen Hoffman
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
I suspect that he means no duplicates for second key PER FIRST KEY.
It was not what was written, but it is what makes sense.
Arne
If I remember correctly (RMS is not my database of choice), that is not
how the keys work. Steve is correct: only 4 records are allowed, according
to the description. The only way I see more records is if the second
key is actually the first key (or some other data) plus the second key.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
abrsvc
2020-11-18 16:35:41 UTC
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
--
Pure Personal Opinion | HoffmanLabs LLC
I have oversimplified the situation just to provide a context for the question. This is just one of many questionable practices within this code system.
More time is spent opening and closing files with singular records than anything else. I am not in a position to change this; I am just trying to assist with understanding how the current system works.

Update: There are duplicates for the second key.
Stephen Hoffman
2020-11-18 17:24:04 UTC
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
I have oversimplified the situation just to provide a context for the
question. This is just one of many questionable practices within this
code system.
More time is spent opening and closing files with singular records than
anything else. I am not in a position to change this; I am just trying
to assist with understanding how the current system works.
Update: There are duplicates for the second key.
I've long assumed the order of record retrieval is indeterminate when
switching to a secondary key with duplicates; that's what is documented.

That secondary key would usually be created as a segmented key here, if
the primary is to (also) be involved in the ordering of secondary
record retrieval.
--
Pure Personal Opinion | HoffmanLabs LLC
abrsvc
2020-11-18 17:42:32 UTC
Post by Stephen Hoffman
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
I have oversimplified the situation just to provide a context for the
question. This is just one of many questionable practices within this
code system.
More time is spent opening and closing files with singular records than
anything else. I am not in a position to change this; I am just trying
to assist with understanding how the current system works.
Update: There are duplicates for the second key.
I've long assumed the order of record retrieval is indeterminate when
switching to a secondary key with duplicates; that's what is documented.
That secondary key would usually be created as a segmented key here, if
the primary is to (also) be involved in the ordering of secondary
record retrieval.
--
Pure Personal Opinion | HoffmanLabs LLC
OK, thanks. That is what I thought. These are not segmented keys. I will look further, but posts here have supplied what I needed.

RE: The overall application design: There are numerous instances of files being used as message communication mechanisms with 1-10 records each. Not very efficient, but that is how this system was created. Many cases of the following sequence: open file, update 1 field in a record, close file. A very crude synchronization technique, as each open is in a loop such that if the file is locked, go back and try again (forever...) until you get it open.

Many other such items throughout the code. I am not involved with "fixing" the problem, only to explain how it currently "works".
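In RMS terms, that open-retry dance is roughly the following (a minimal, untested C sketch; the file name is invented and error handling is trimmed):

#include <rms.h>
#include <rmsdef.h>
#include <starlet.h>
#include <string.h>

int main(void)
{
    struct FAB fab = cc$rms_fab;
    int sts;

    fab.fab$l_fna = "flag.idx";               /* invented file name */
    fab.fab$b_fns = strlen("flag.idx");
    fab.fab$b_fac = FAB$M_GET | FAB$M_PUT | FAB$M_UPD;
    /* no FAB$M_SHR* bits requested, so the open is exclusive and a
       second opener is refused with RMS$_FLK while we hold the file */

    for (;;) {                                /* the "forever" loop */
        sts = sys$open(&fab, 0, 0);
        if (sts & 1) break;                   /* got it */
        if (sts != RMS$_FLK) return sts;      /* a real error, give up */
        /* locked by the other process: just go back and try again */
    }

    /* ... $CONNECT, then $GET/$UPDATE one field in one record ... */

    sys$close(&fab, 0, 0);
    return 1;                                 /* SS$_NORMAL */
}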
Stephen Hoffman
2020-11-18 18:41:10 UTC
Post by abrsvc
OK, thanks. That is what I thought. These are not segmented keys. I
will look further, but posts here have supplied what I needed.
RE: The overall application design: There are numerous instances of
files being used as message communication mechanisms with 1-10 records
each. Not very efficient, but that is how this system was created.
Many cases of the following sequence: open file, update 1 field in a
record, close file. A very crude synchronization technique, as each
open is in a loop such that if the file is locked, go back and try
again (forever...) until you get it open.
Many other such items throughout the code. I am not involved with
"fixing" the problem, only to explain how it currently "works".
Somebody's seemingly doing a port off of OpenVMS, eh?
--
Pure Personal Opinion | HoffmanLabs LLC
abrsvc
2020-11-18 18:59:38 UTC
Post by Stephen Hoffman
OK, thanks. That is what I thought. These are not segmented keys. I
will look further, but posts here have supplied what I needed.
RE: The overall application design: There are numerous instances of
files being used as message communication mechanisms with 1-10 records
each. Not very efficient, but that is how this system was created.
Many cases of the following sequence: open file, update 1 field in a
record, close file. A very crude synchronization technique, as each
open is in a loop such that if the file is locked, go back and try
again (forever...) until you get it open.
Many other such items throughout the code. I am not involved with
"fixing" the problem, only to explain how it currently "works".
Somebody's seemingly doing a port off of OpenVMS, eh?
--
Pure Personal Opinion | HoffmanLabs LLC
Regrettably yes, but not because of VMS itself. There is specific hardware tied to the VAX (yes, I know) that is no longer available, and spares are becoming scarce.

The project is a factory control system that is better suited to PLC systems with a backend driver/collector. Sadly, while a VMS solution would be easier overall in concert with PLCs, there is limited VMS experience at the site, and that experience is slated for retirement in the next year or so. PLCs and some x86 servers will replace the old VAX systems.
Dave Froble
2020-11-18 17:51:59 UTC
Post by Stephen Hoffman
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
I have oversimplified the situation just to provide a context for the
question. This is just one of many questionable practices within this
code system.
More time is spent opening and closing files with singular records
than anything else. I am not in a position to change this; I am just
trying to assist with understanding how the current system works.
Update: There are duplicates for the second key.
I've long assumed the order of record retrieval is indeterminate when
switching to a secondary key with duplicates; that's what is documented.
That secondary key would usually be created as a segmented key here, if
the primary is to (also) be involved in the ordering of secondary record
retrieval.
Ok, some RMS trivia.

The primary key is an ISAM structure. If I remember correctly, the data
records are sequenced (poor term) by the primary key order in the tree
structure. Basically the primary key and the data record are the same
piece of data. (Makes insertions interesting.)

However, the secondary key structures do not follow the scheme of one
key for each record. Instead, for each unique secondary key, there is
one key "record", in the ISAM tree structure, with a list of the records
with that value for the secondary key. Not sure how new entries are
added; most likely at the front of the list or at the end. I'd guess
the end, and I'd assume a linked list.

Not the way I'd design it. But I wasn't asked. My database product has
one key record in each key structure for each data record. Just the way
I approached the implementation.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Dave Froble
2020-11-18 17:40:05 UTC
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
--
Pure Personal Opinion | HoffmanLabs LLC
I have oversimplified the situation just to provide a context for the question. This is just one of many questionable practices within this code system.
More time is spent opening and closing files with singular records than anything else. I am not in a position to change this; I am just trying to assist with understanding how the current system works.
Update: There are duplicates for the second key.
Ok, then there can be but one record for each unique primary key.

Each record can have a secondary key of A, B, L, or blank.

That's it.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Jan-Erik Söderholm
2020-11-18 18:22:05 UTC
Post by Dave Froble
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
--
Pure Personal Opinion | HoffmanLabs LLC
I have oversimplified the situation just to provide a context for the
question. This is just one of many questionable practices within this
code system.
More time is spent opening and closing files with singular records than
anything else. I am not in a position to change this; I am just trying
to assist with understanding how the current system works.
Update:  There are duplicates for the second key.
Ok, then there can be but one record for each unique primary key.
Each record can have a secondary key of A, B, L, or blank.
That's it.
And if you read by the first key, you will get one specific record.
If you read by the second key, you will get *any* record having
that second key value, no matter what you read before, right?

I think the root question was if RMS keeps a file position between the
two, so that the second read depends on the first, in some way. Like
"the first record after the record read in the first read", or similar.

But, I do not follow the logic fully anyway... :-)
Arne Vajhøj
2020-11-18 19:34:48 UTC
Post by abrsvc
Situation: Indexed file with 2 keys, the first key no duplicates,
second key no duplicates. For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file? Which
implies there *are* duplicates on the secondary?
Update: There are duplicates for the second key.
You did not say which API.

I am too lazy to code it using direct RMS calls, but
Pascal is easier:

$ type keys.pas
program keys(input, output);

type
   fix50str = packed array [1..50] of char;
   arr10int = array [1..10] of integer;
   data = record
             k1 : [KEY(0, NODUPLICATES)] integer;
             k2 : [KEY(1, DUPLICATES)] char;
             v : fix50str;
          end;

var
   f : file of data;
   d : data;
   ids : arr10int value [ 1:6; 2:2; 3:7; 4:1; 5:8; 6:3; 7:9; 8:10; 9:5; 10:4 ];
   i, id : integer;
   letter : char;

begin
   open(f, 'keys.isq', new, organization := indexed, access_method := keyed);
   for i := 1 to 10 do begin
      id := ids[i];
      letter := chr(65 + (i mod 3));
      d.k1 := id;
      d.k2 := letter;
      d.v := 'Record #' + dec(i, 4);
      f^ := d;
      put(f);
   end;
   for i := 1 to 10 do begin
      id := ids[i];
      findk(f, 0, id);
      if not ufb(f) then begin
         d := f^;
         writeln('lookup k1=', id:2, ' : ', d.k1:2, ' ', d.k2, ' ', d.v);
         letter := d.k2;
         findk(f, 1, letter);
         if not ufb(f) then begin
            d := f^;
            writeln('lookup k2= ', letter, ' : ', d.k1:2, ' ', d.k2, ' ', d.v);
         end;
      end;
   end;
   close(f);
end.
$ pas keys
$ lin keys
$ run keys
lookup k1= 6 : 6 B Record #0001
lookup k2= B : 6 B Record #0001
lookup k1= 2 : 2 C Record #0002
lookup k2= C : 2 C Record #0002
lookup k1= 7 : 7 A Record #0003
lookup k2= A : 7 A Record #0003
lookup k1= 1 : 1 B Record #0004
lookup k2= B : 6 B Record #0001
lookup k1= 8 : 8 C Record #0005
lookup k2= C : 2 C Record #0002
lookup k1= 3 : 3 A Record #0006
lookup k2= A : 7 A Record #0003
lookup k1= 9 : 9 B Record #0007
lookup k2= B : 6 B Record #0001
lookup k1=10 : 10 C Record #0008
lookup k2= C : 2 C Record #0002
lookup k1= 5 : 5 A Record #0009
lookup k2= A : 7 A Record #0003
lookup k1= 4 : 4 B Record #0010
lookup k2= B : 6 B Record #0001

Arne
Bill Gunshannon
2020-11-18 17:14:28 UTC
Situation:  Indexed file with 2 keys, the first key no duplicates,
second key no duplicates.  For clarity, the first key is a record
number and the second key is one of 4 values: A,B,L or blank.
With only four records possible, why is this even a file?  Which implies
there *are* duplicates on the secondary?
First thought that came to my mind. :-)

bill
Hein RMS van den Heuvel
2020-11-18 22:45:23 UTC
Post by abrsvc
I was asked the following question and am not 100% sure of the answer, so...
Situation: Indexed file with 2 keys, the first key no duplicates, second key no duplicates. For clarity, the first key is a record number and the second key is one of 4 values: A,B,L or blank.
This was later adjusted by OP - There are duplicates on secondary keys.
Post by abrsvc
A read is posted using the primary key. The next read uses the secondary key say of value A. Does the second read select the matching record within the context of the primary key or just the first record encountered with a matching secondary key?
I believe that the record context remains. Am I correct?
Noop. Incorrect. A keyed record lookup establishes a fresh context.
The 'first' record with that alternate key will be returned every time no matter which record was found by primary key.
Any 'next' read - sequential get - will return the next record by that last used index.

If you wanted the 'next' duplicate value for an alternate index to read by order of the primary (or other key) then, as per Hoff, one could add a 'segment' to the key bytes, essentially de-duplicating it.
The writer clarified that this is not needed, as he just wants to confirm what is happening today.
If this change is ever desired then please know that such a change can often, but not always, be made to the file without changing the programs accessing the files.
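In direct RMS calls, the two reads in question look roughly like this (an untested C sketch; the file name and record layout are invented to match the description, and status checks are omitted for brevity):

#include <rms.h>
#include <starlet.h>
#include <string.h>

/* Invented layout matching the description: a record number as
   primary key (key 0) and a one-character A/B/L/blank status as
   secondary key (key 1). */
struct rec { unsigned int recnum; char status; char text[50]; };

int main(void)
{
    struct FAB fab = cc$rms_fab;
    struct RAB rab = cc$rms_rab;
    struct rec r;
    unsigned int recnum = 42;        /* some primary key value */
    char status = 'A';

    fab.fab$l_fna = "plant.idx";     /* invented file name */
    fab.fab$b_fns = strlen("plant.idx");
    fab.fab$b_fac = FAB$M_GET;
    fab.fab$b_shr = FAB$M_SHRGET;
    sys$open(&fab, 0, 0);

    rab.rab$l_fab = &fab;
    sys$connect(&rab, 0, 0);
    rab.rab$l_ubf = (void *) &r;
    rab.rab$w_usz = sizeof r;

    /* Read 1: keyed $GET by the primary key (KRF 0). */
    rab.rab$b_rac = RAB$C_KEY;
    rab.rab$b_krf = 0;
    rab.rab$l_kbf = (void *) &recnum;
    rab.rab$b_ksz = sizeof recnum;
    sys$get(&rab, 0, 0);

    /* Read 2: keyed $GET by the secondary key (KRF 1).  This is a
       fresh context: RMS returns the first 'A' duplicate (first by
       insertion order), regardless of where read 1 left us. */
    rab.rab$b_rac = RAB$C_KEY;
    rab.rab$b_krf = 1;
    rab.rab$l_kbf = (void *) &status;
    rab.rab$b_ksz = sizeof status;
    sys$get(&rab, 0, 0);

    /* A subsequent RAB$C_SEQ $GET would continue along the secondary
       index from here, i.e. the next 'A' duplicate (or the first 'B'
       once the A's are exhausted), not along the primary. */

    sys$close(&fab, 0, 0);
    return 1;
}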
Post by abrsvc
Dave Froble trivia....
Except for Prolog-1 files, the primary key bytes are actually isolated into the first bytes of the record, but yeah, it travels with the data.

For alternate keys with duplicates there is indeed an array of pointers for each unique key value.
The pointers are added to the end, thus the record which was added first will be returned first, but a CONVERT can and will change that.
The pointers are called RRVs, Record Retrieval Vectors, which essentially equate to RFAs: VBN + record ID number plus a flag byte.
It is a simple list, constrained by the bucket size.
When a bucket fills with pointers (7 bytes each, so a good 1000 per 8KB, 16-block bucket), a whole new bucket is added to the list, started out with the (duplicated) key value and its own array.
I've worked with files having millions of duplicates and thus thousands of continuation buckets.
For such files, a new duplicate insert will require thousands of (cached) reads before a single write can happen... each time. Ouch.
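Back-of-envelope, assuming 16-block buckets and ignoring the bytes spent on bucket headers and the duplicated key value itself (so these are rough upper bounds):

#include <stdio.h>

int main(void)
{
    const long bucket_bytes = 16 * 512;    /* a 16-block bucket */
    const long rrv_bytes = 7;              /* VBN + record ID + flags */
    const long duplicates = 5000000;       /* say, 5 million duplicates */

    long per_bucket = bucket_bytes / rrv_bytes;                 /* ~1170 */
    long buckets = (duplicates + per_bucket - 1) / per_bucket;  /* ~4274 */

    printf("~%ld RRVs per bucket, ~%ld continuation buckets\n",
           per_bucket, buckets);
    return 0;
}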
Post by abrsvc
Arne's Pascal RMS example.
Nicely done Arne.
Myself I prefer to use DCL to demonstrate just about any RMS problem.
For example here:

$ create tmp.idx/fdl="file; org ind; key 0; seg0_l 4; key 1; dupl yes; seg0_p 5; seg0_l 1"
$ conver/merge tt: tmp.idx
noot A
vuur B
 aap A
mies B
Exit
$ type tmp.idx
 aap A
mies B
noot A
vuur B
$
$ open/read tmp tmp.idx
$ read/ind=0/key=" aap" tmp record
$ show symb record
RECORD = " aap A"
$ read/ind=1/key="A" tmp record
$ show symb record
RECORD = "noot A"

Enjoy!
Hein.
abrsvc
2020-11-18 23:13:04 UTC
Post by Hein RMS van den Heuvel
Post by abrsvc
I was asked the following question and am not 100% sure of the answer, so...
Situation: Indexed file with 2 keys, the first key no duplicates, second key no duplicates. For clarity, the first key is a record number and the second key is one of 4 values: A,B,L or blank.
This was later adjusted by OP - There are duplicates on secondary keys.
Post by abrsvc
A read is posted using the primary key. The next read uses the secondary key say of value A. Does the second read select the matching record within the context of the primary key or just the first record encountered with a matching secondary key?
I believe that the record context remains. Am I correct?
Noop. Incorrect. A keyed record lookup establishes a fresh context.
The 'first' record with that alternate key will be returned every time no matter which record was found by primary key.
Any 'next' read - sequential get - will return the next record by that last used index.
If you wanted the 'next' duplicate value for an alternate index to read by order of the primary (or other key) then, as per Hoff, one could add a 'segment' to the key bytes, essentially de-duplicating it.
The writer clarified that this is not needed, as he just wants to confirm what is happening today.
If this change is ever desired then please know that such a change can often, but not always, be made to the file without changing the programs accessing the files.
Post by abrsvc
Dave Froble trivia....
Except for Prolog-1 files, the primary key bytes are actually isolated into the first bytes of the record, but yeah, it travels with the data.
For alternate keys with duplicates there is indeed an array of pointers for each unique key value.
The pointers are added to the end, thus the record which was added first will be returned first, but a CONVERT can and will change that.
The pointers are called RRVs, Record Retrieval Vectors, which essentially equate to RFAs: VBN + record ID number plus a flag byte.
It is a simple list, constrained by the bucket size.
When a bucket fills with pointers (7 bytes each, so a good 1000 per 8KB, 16-block bucket), a whole new bucket is added to the list, started out with the (duplicated) key value and its own array.
I've worked with files having millions of duplicates and thus thousands of continuation buckets.
For such files, a new duplicate insert will require thousands of (cached) reads before a single write can happen... each time. Ouch.
Post by abrsvc
Arne's Pascal RMS example.
Nicely done Arne.
Myself I prefer to use DCL to demonstrate just about any RMS problem.
$ create tmp.idx/fdl="file; org ind; key 0; seg0_l 4; key 1; dupl yes; seg0_p 5; seg0_l 1"
$ conver/merge tt: tmp.idx
noot A
vuur B
 aap A
mies B
Exit
$ type tmp.idx
 aap A
mies B
noot A
vuur B
$
$ open/read tmp tmp.idx
$ read/ind=0/key=" aap" tmp record
$ show symb record
RECORD = " aap A"
$ read/ind=1/key="A" tmp record
$ show symb record
RECORD = "noot A"
Enjoy!
Hein.
Taking your statement as listed below and using your tmp.idx file as an example: wouldn't the keyed read using the secondary key return the ' aap A' record, since it is the first in the file with secondary key 'A', rather than the first 'A' record after the initial read with key 0?


The 'first' record with that alternate key will be returned every time no matter which record was found by primary key.
Stephen Hoffman
2020-11-19 00:14:52 UTC
Post by abrsvc
Taking your statement as listed below and using your tmp.idx file as an
example: wouldn't the keyed read using the secondary key return the
' aap A' record, since it is the first in the file with secondary key
'A', rather than the first 'A' record after the initial read with key 0?
Hein is assuming a key search on the secondary key to start the
sequential access on the secondary key, and that'd give you whatever is
the first A record if that's what was requested, as was mentioned.

From your key-switch comments, I'm assuming that's not quite what this
code is doing. Rather, I'm assuming this code is doing a keyed search
on the primary, then switching to and doing sequential $get operations
on the secondary—sequential $get, not a keyed-access $get or
keyed-access $find—from whatever record you're already positioned at on
the secondary.

If you're doing a sequential $get from what is effectively an
indeterminate position within any group of duplicates present on the
secondary key—again, a starting position within the duplicates present
on the secondary key, as determined by what is an unrelated primary key
$get or $find access—what record might you sequentially $get on the
secondary key is indeterminate.

The ordering for the primary key is indeterminate when accessing
records sequentially on the secondary key.

It is entirely possible that your sequential $get on the secondary key
could start from an "A" record and the sequential $get on the secondary
key could then return a "B" record, if you happened to start on the
last A among the duplicates.

If a developer wants to reliably and efficiently acquire a selection
of records associated with the primary key, they'll need to extend the
primary key to include that field or (more commonly, as the data might
not be adjacent) define the secondary key as a segmented key that
overlaps with the primary and the field containing that "A", "B", "L"
or " " value.

RMS files that get churned with added and deleted records can build
up trash within the indexed structures. Performance issues are common
here as the detritus accumulates.

There are various ways for inefficiencies and coding bugs to be added
here, too.
--
Pure Personal Opinion | HoffmanLabs LLC
abrsvc
2020-11-19 03:02:56 UTC
Post by Stephen Hoffman
Post by abrsvc
Taking your statement as listed below and using your tmp.idx file as an
example. Wouldn't the returned record for the keyed read using the
secondary key return the aap A record since it is the first in the file
with the secondary key of A rather than the first A record after the
initial read with key 0?
Hein is assuming a key search on the secondary key to start the
sequential access on the secondary key, and that'd give you whatever is
the first A record if that's what was requested, as was mentioned.
From your key-switch comments, I'm assuming that's not quite what this
code is doing. Rather, I'm assuming this code is doing a keyed search
on the primary, then switching to and doing sequential $get operations
on the secondary—sequential $get, not a keyed-access $get or
keyed-access $find—from whatever record you're already positioned at on
the secondary.
If you're doing a sequential $get from what is effectively an
indeterminate position within any group of duplicates present on the
secondary key—again, a starting position within the duplicates present
on the secondary key, as determined by what is an unrelated primary key
$get or $find access—what record might you sequentially $get on the
secondary key is indeterminate.
The ordering for the primary key is indeterminate when accessing
records sequentially on the secondary key.
It is entirely possible that your sequential $get on the secondary key
could start from an "A" record and the sequential $get on the secondary
key could then return a "B" record, if you happened to start on the
last A among the duplicates.
If a developer wants to reliably and efficiently acquire a selection
of records associated with the primary key, they'll need to extend the
primary key to include that field or (more commonly, as the data might
not be adjacent) define the secondary key as a segmented key that
overlaps with the primary and the field containing that "A", "B", "L"
or " " value.
RMS files that get churned with added and deleted records can build
up trash within the indexed structures. Performance issues are common
here as the detritus accumulates.
There are various ways for inefficiencies and coding bugs to be added
here, too.
--
Pure Personal Opinion | HoffmanLabs LLC
That makes sense. So, as I thought I stated: if a read is done using the primary key first and then a read using a secondary key, the "starting point" is the position within the file based upon the first read, such that read #2 will return the first record with the matching secondary key that appears AFTER the first record read, rather than the first record in the file that matches the secondary key.
Hein RMS van den Heuvel
2020-11-19 04:50:35 UTC
That makes sense. So, as I thought I stated: if a read is done using the primary key first and then a read using a secondary key, the "starting point" is the position within the file based upon the first read, such that read #2 will return the first record with the matching secondary key that appears AFTER the first record read, rather than the first record in the file that matches the secondary key.
Again, as I wrote before ---- Noop. Incorrect. A keyed record lookup establishes a fresh context.

What part of that do you not understand?
example: wouldn't the keyed read using the
secondary key return the ' aap A' record, since it is the first in the file
with secondary key 'A'
NOOP. From the example:

$ conver/merge tt: tmp.idx
noot A
vuur B
 aap A

Therefore the first record inserted (BY TIME!) with key A was 'noot', not ' aap'.
The first record - by primary key - is ' aap', but that was not the first alternate key 'A' to be inserted.

Hoff wrote>> Hein is assuming a key search on the secondary key to start the sequential access on the secondary key,

I assume nothing. I simply know how it is. That's a Dutch thing. Much like: https://images.app.goo.gl/QhdvQ7CvxordwGEj6
Key search is NOT sequential search. Sequential access and context are established by either:
1) a $REWIND or $CONNECT with KRF=xxx, pointing to the first row by that key, and with duplicates the first (by time) inserted with that key value.
2) a $GET with KRF=xxx, pointing to the first row with an alternate key satisfying the key (KBF, KSZ, RAC)... independent of any other action/context on that stream ($CONNECT)

Hoff>> then switching to and doing sequential $get operations on the secondary

That functionality does NOT exist in RMS

Hoff>> what record might you sequentially $get on the secondary key is indeterminate.

Noop. You'll get the next record by primary key as you cannot switch key context on sequential access as per above.
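To make case 1 concrete against the tmp.idx example above, an untested C sketch (status handling kept minimal):

#include <rms.h>
#include <starlet.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    struct FAB fab = cc$rms_fab;
    struct RAB rab = cc$rms_rab;
    char buf[80];

    fab.fab$l_fna = "tmp.idx";      /* the demo file from above */
    fab.fab$b_fns = strlen("tmp.idx");
    fab.fab$b_fac = FAB$M_GET;
    fab.fab$b_shr = FAB$M_SHRGET;
    if (!(sys$open(&fab, 0, 0) & 1)) return 2;

    rab.rab$l_fab = &fab;
    rab.rab$b_krf = 1;              /* case 1: $CONNECT with KRF = 1 */
    if (!(sys$connect(&rab, 0, 0) & 1)) return 2;

    rab.rab$l_ubf = buf;
    rab.rab$w_usz = sizeof buf;
    rab.rab$b_rac = RAB$C_SEQ;

    /* Sequential $GETs now walk the secondary index from its first
       entry; within duplicates, first-inserted comes first.  For the
       data above that should print "noot A", " aap A", "vuur B",
       "mies B". */
    while (sys$get(&rab, 0, 0) & 1)
        printf("%.*s\n", (int) rab.rab$w_rsz, buf);

    sys$close(&fab, 0, 0);
    return 1;
}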

Cheers,
Hein
abrsvc
2020-11-19 12:12:29 UTC
That makes sense. So, as I thought I stated: if a read is done using the primary key first and then a read using a secondary key, the "starting point" is the position within the file based upon the first read, such that read #2 will return the first record with the matching secondary key that appears AFTER the first record read, rather than the first record in the file that matches the secondary key.
Again, as I wrote before ---- Noop. Incorrect. A keyed record lookup establishes a fresh context.
What part of that do you not understand?
example: wouldn't the keyed read using the
secondary key return the ' aap A' record, since it is the first in the file
with secondary key 'A'
$ conver/merge tt: tmp.idx
noot A
vuur B
 aap A
Therefore the first record inserted (BY TIME!) with key A was 'noot', not ' aap'.
The first record - by primary key - is ' aap', but that was not the first alternate key 'A' to be inserted.
Hoff wrote>> Hein is assuming a key search on the secondary key to start the sequential access on the secondary key,
I assume nothing. I simply know how it is. That's a Dutch thing. Much like: https://images.app.goo.gl/QhdvQ7CvxordwGEj6
Key search is NOT sequential search. Sequential access and context are established by either:
1) a $REWIND or $CONNECT with KRF=xxx, pointing to the first row by that key, and with duplicates the first (by time) inserted with that key value.
2) a $GET with KRF=xxx, pointing to the first row with an alternate key satisfying the key (KBF, KSZ, RAC)... independent of any other action/context on that stream ($CONNECT)
Hoff>> then switching to and doing sequential $get operations on the secondary
That functionality does NOT exist in RMS
Hoff>> what record might you sequentially $get on the secondary key is indeterminate.
Noop. You'll get the next record by primary key as you cannot switch key context on sequential access as per above.
Cheers,
Hein
Got it. Thanks for the clarification.
Stephen Hoffman
2020-11-19 15:47:03 UTC
...with duplicates the first (by time) inserted with that first key...
Which I've incorrectly referred to as "indeterminate" here, though I've
approximately never been seeking duplicate records returned in
insertion-time order.

The whole of the RMS user doc is problematic too, but then I'm in a
polite mood today.
--
Pure Personal Opinion | HoffmanLabs LLC
Hein RMS van den Heuvel
2020-11-19 17:36:47 UTC
Post by Stephen Hoffman
...with duplicates the first (by time) inserted with that first key...
Which I've incorrectly referred to as "indeterminate" here, though I've
approximately never been seeking duplicate records returned in
insertion-time order.
You have... Remember VMSmail? The FOLDER name? That's an alternate key with duplicates!
So a DIR/FOL=xxx returns messages in the order you'd expect.
Least surprise engineering.

Business applications might have a 'status' flag, say 'A' for Action, 'P' for Print and ' ' <space> for 'done', with a NULL key value of space.
The application can subsequently ask for the first A row to be processed or P row to be printed.
It can be useful, but it is not very robust.

To build on my earlier example, see what CONVERT does to the file:

$ conv/key=1 tmp.idx tt:/fdl=nl:
noot A
 aap A
vuur B
mies B
$ conv tmp.idx new.idx
$ conv/key=1 new.idx tt:/fdl=nl:
 aap A
noot A
mies B
vuur B

Mind you, for VMSmail this caused no issue because the PK was a timestamp, and thus CONVERT delivered messages in folders in time order.
Many, if not most, business applications essentially hand out PKs in always-increasing order, again making the right thing happen in general, but not guaranteed. I'm sure this has caused head-scratching or perhaps support calls over time.

Fun?
Hein
abrsvc
2020-11-19 17:48:48 UTC
Post by Hein RMS van den Heuvel
Post by Stephen Hoffman
...with duplicates the first (by time) inserted with that first key...
Which I've incorrectly referred to as "indeterminate" here, though I've
approximately never been seeking duplicate records returned in
insertion-time order.
You have... Remember VMSmail? The FOLDER name? That's an alternate key with duplicates!
So a DIR/FOL=xxx returns messages in the order you'd expect.
Least surprise engineering.
Business applications might have a 'status' flag, say 'A' for Action, 'P' for Print and ' ' <space> for 'done', with a NULL key value of space.
The application can subsequently ask for the first A row to be processed or P row to be printed.
It can be useful, but it is not very robust.
noot A
 aap A
vuur B
mies B
$ conv tmp.idx new.idx
 aap A
noot A
mies B
vuur B
Mind you, for VMSmail this caused no issue because the PK was a timestamp, and thus CONVERT delivered messages in folders in time order.
Many, if not most, business applications essentially hand out PKs in always-increasing order, again making the right thing happen in general, but not guaranteed. I'm sure this has caused head-scratching or perhaps support calls over time.
Fun?
Hein
Yes, it's fun...

Once again I see that an application works by accident rather than by intent. This is not the first, nor, I suspect, will it be the last.
Stephen Hoffman
2020-11-19 18:48:18 UTC
Post by Hein RMS van den Heuvel
Mind you, for VMSmail this caused no issue because the PK was a
timestamp, and thus CONVERT delivered messages in folders in time order.
Message arrival order is not the same as timestamp order, with MAIL. A
disparity lurks within the MAIL handling of folders, though that was
ignored as it's only very rarely encountered and benign.
Post by Hein RMS van den Heuvel
Many, if not most, business applications essentially hand out PKs in
always-increasing order, again making the right thing happen in general,
but not guaranteed. I'm sure this has caused head-scratching or perhaps
support calls over time.
Fun?
Fun that can arise in these cases:
...German Tank Problem.
...Daylight Saving Time. Which is the source of the above
timestamp-versus-arrival-sequence bug.
...Filenaming order and directory sorting, as per an existing MAIL mess.


RMS doc needs an overhaul and updates, and cookbooks. It's scattered
all over, at present.

OpenVMS needs better built-in storage options, such as SQLite,
discussions of The One True Database (Rdb) aside.
--
Pure Personal Opinion | HoffmanLabs LLC
Phillip Helbig (undress to reply)
2020-11-19 20:14:57 UTC
Post by Hein RMS van den Heuvel
You have... Remember VMSmail?
I've been using VMS MAIL for almost 30 years. I have experience with
Outlook, Lotus Notes, elm, pine, various web-based email stuff, unix
mail, and so on. VMS MAIL makes stuff I do often easy, some of which
might not even be possible with other programs, for example

MAIL> DIR/SINCE=<date1>/FROM=<address>/SUB=<subject>/BEF=<date2>/TO=<address2>

and so on. The fact that the headers and start of each message are in
the indexed file makes it quick to page through stuff; the rest of the
message in the external file is read only if needed.

I have many mail files in several directories on a few disks. I can use
VMS MAIL files from other systems just via SET FILE after copying them
to my cluster.

Years ago, I read about the following interesting idea here. Someone
liked to keep all the messages in one MAIL file (rather than having many
MAIL files as I do) so that he could use SEARCH and various DIRECTORY
commands to find anything. However, this became a problem when the
number of external MAIL$* files became too many for good directory
performance. So he created subdirectories based on years (and perhaps
months) and renamed the MAIL$* files to the appropriate directories. He
then used SET FILE/ENTER to create a hard link to the main MAIL file in
each subdirectory. He could then use DIR to find what he wanted and if
MAIL said that the external file couldn't be found, just did SET FILE
[.<subdirectory>] and then it could be found.
Stephen Hoffman
2020-11-19 21:01:56 UTC
Post by Phillip Helbig (undress to reply)
I've been using VMS MAIL for almost 30 years. I have experience with
Outlook, Lotus Notes, elm, pine, various web-based email stuff, unix
mail, and so on. VMS MAIL makes stuff I do often easy, some of which
might not even be possible with other programs, for example
MAIL>
DIR/SINCE=<date1>/FROM=<address>/SUB=<subject>/BEF=<date2>/TO=<address2>
For those times when this is required more than once, macOS Mail allows
"folders" to be established which show only the required combination.
These "folders" are effectively saved search criteria, a concept which
doesn't exist in the OpenVMS MAIL tooling. There are other places where
searches can be saved, including in Finder, the macOS analog to the
OpenVMS DECwindows FileView app. And when you're live-entering the
equivalent of your search (subject:subj to:***@example.com
from:***@example.com, etc) you're offered the opportunity to further
tailor and to save that search for future use. This syntax isn't used
very often within macOS Mail, but it does exist.

In addition to macOS Mail and its saved searches, here's how to
configure saved searches on Microsoft Office on Mac and on Windows:
https://support.microsoft.com/en-us/office/save-a-search-by-using-a-smart-folder-in-outlook-for-mac-fd86c4dc-8fc3-495e-8a28-5f670ebfa393

https://www.makeuseof.com/tag/set-windows-smart-folders-saving-searches/

macOS Spotlight search is far more flexible than are SEARCH-based and
DIRECTORY-based searches, and Spotlight is available both in the
graphical interface and at the command line. And can be scripted.

mdfind is part of the command-line interface for Spotlight:
https://ss64.com/osx/mdfind.html

Spotlight and searching capabilities are among the features that I miss
the most when working with OpenVMS. The SEARCH-related commands and
DIRECTORY-based search commands are just... so... slow...

As for your preferences here? You do not need to justify what works for
you. Or what music you happen to prefer. Not at all. That's entirely
your call.
--
Pure Personal Opinion | HoffmanLabs LLC
Phillip Helbig (undress to reply)
2020-11-19 22:06:43 UTC
Post by Stephen Hoffman
In addition to macOS Mail and its saved searches, here's how to
Interesting. But HELP DIR is probably quicker as far as learning how to
set it up goes.

I have no experience with macOS Mail. I do have an iPad, and my wife
has a MacBook Air, but neither of us use any mail tools on those
platforms (yet).
Post by Stephen Hoffman
Spotlight and searching capabilities are among the features that I miss
the most when working with OpenVMS. The SEARCH-related commands and
DIRECTORY-based search commands are just... so... slow...
Don't forget MLSEARCH, which is a command-line utility (i.e. it runs an
executable) written in BLISS and Fortran and adds stuff such as
wildcards.
Stephen Hoffman
2020-11-19 22:49:00 UTC
Post by Phillip Helbig (undress to reply)
Post by Stephen Hoffman
In addition to macOS Mail and its saved searches, here's how to
Interesting. But HELP DIR is probably quicker as far as learning how
to set it up goes.
For you and given your experience with OpenVMS and given your
preferences for OpenVMS, no doubt.

For others not sharing that? Doubtful.

With macOS Mail or macOS Finder, when you type a search in the search
box you're offered to Save, and when you Save you're then presented
with a dialog that lets you build or modify the saved search. Or you
can request the dialog to build the search directly; Mail offers a
dialog item for a Smart Mailbox, and Finder offers a dialog item for a
Smart Folder; each is a saved search.
Post by Phillip Helbig (undress to reply)
I have no experience with macOS Mail. I do have an iPad, and my wife
has a MacBook Air, but neither of us use any mail tools on those
platforms (yet).
Assuming the use of VSI or HPE TCP/IP Services, you'll have some
difficulty connecting to OpenVMS mail from the macOS and iPadOS Mail
client tools due to the lack of connection security on OpenVMS; you're
limited to plaintext.

macOS will allow that connection, but will be cranky about the lack of
security. I haven't needed to configure an insecure connection from
either in some years, though. macOS and iPadOS connections to mail
servers with TLS support are entirely feasible.

Process Multinet is better at connection security, though VSI TCP/IP
seems to have dropped that stack from the roadmap. The current roadmap
shows updates (seemingly) to the existing HPE-derived TCP/IP Services
stack, omitting mention of the VSI fork of Process Multinet.

I usually configure the OpenVMS TCP/IP Services mail server to gateway
outbound messages through another local or hosted server—to the local
Exchange Server or Postfix server or Azure or otherwise—as that
avoids various connectivity issues, and it's easier with the mail stored on
Exchange Server or Postfix or suchlike. The recipient server does need
to be configured to allow insecure relay access, as otherwise the
OpenVMS connections will be rejected.
Post by Phillip Helbig (undress to reply)
Post by Stephen Hoffman
Spotlight and searching capabilities are among the features that I miss
the most when working with OpenVMS. The SEARCH-related commands and
DIRECTORY-based search commands are just... so... slow...
Don't forget MLSEARCH, which is a command-line utility (i.e. it runs an
executable) written in BLISS and Fortran and adds stuff such as
wildcards.
More recent search designs use caching: a combination of file system
change notifications—the closest analog for that on OpenVMS is a file
access ACL, and there's no means to trigger an AST from that short of
an obscure and seldom-used security mailbox operation—is used to build
up a cache of the search data. This data then makes searches massively
quicker than any search that traverses and performs the collection from
the search target in response to the search command; faster than
DIRECTORY, SEARCH, find, or such, as the tool has its metadata already
available, without having to synchronously traverse the target storage.
--
Pure Personal Opinion | HoffmanLabs LLC
Phillip Helbig (undress to reply)
2020-11-19 20:17:50 UTC
Post by Phillip Helbig (undress to reply)
I've been using VMS MAIL for almost 30 years. I have experience with
Outlook, Lotus Notes, elm, pine, various web-based email stuff, unix
mail, and so on. VMS MAIL makes stuff I do often easy, some of which
might not even be possible with other programs, for example
MAIL> DIR/SINCE=<date1>/FROM=<address>/SUB=<subject>/BEF=<date2>/TO=<address2>
and so on. The fact that the headers and start of each message are in
the indexed file makes it quick to page through stuff; the rest of the
message in the external file is read only if needed.
And I can write in EDT. And there is the keypad for MAIL commands. And
I can define my own keypad. And proper HELP.

I like the fact that NEWSRDR tries to be like VMS MAIL.
Michael Moroney
2020-11-20 00:07:10 UTC
For whatever it's worth, keys in RMS index files can overlap and be segmented.
It's possible for the primary key to be a record number and the secondary key
be the record number with a character (A,B,L, blank) appended to the end. Both
would have no duplicates but both would be of the same type, likely a character
string. You could have a third overlapping key which is just the (A,B,L, blank)
if you wanted to find all the "B" records for example.
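A rough C sketch of creating such a file with direct RMS calls (untested; the file name, record size and byte positions are invented for illustration):

#include <rms.h>
#include <starlet.h>
#include <string.h>

int main(void)
{
    struct FAB fab = cc$rms_fab;
    struct XABKEY k0 = cc$rms_xabkey;   /* primary: record number   */
    struct XABKEY k1 = cc$rms_xabkey;   /* secondary: number + flag */
    struct XABKEY k2 = cc$rms_xabkey;   /* third: the flag alone    */

    fab.fab$l_fna = "demo.idx";         /* invented file name */
    fab.fab$b_fns = strlen("demo.idx");
    fab.fab$b_org = FAB$C_IDX;
    fab.fab$b_rfm = FAB$C_FIX;
    fab.fab$w_mrs = 56;                 /* invented record size */
    fab.fab$b_fac = FAB$M_GET | FAB$M_PUT;
    fab.fab$l_xab = (void *) &k0;

    k0.xab$b_ref = 0;                   /* key 0: bytes 0-4, unique */
    k0.xab$w_pos0 = 0;  k0.xab$b_siz0 = 5;
    k0.xab$l_nxt = (void *) &k1;

    k1.xab$b_ref = 1;                   /* key 1: bytes 0-5, overlaps
                                           key 0, still unique */
    k1.xab$w_pos0 = 0;  k1.xab$b_siz0 = 6;
    k1.xab$l_nxt = (void *) &k2;

    k2.xab$b_ref = 2;                   /* key 2: byte 5 only, the
                                           A/B/L/blank flag */
    k2.xab$w_pos0 = 5;  k2.xab$b_siz0 = 1;
    k2.xab$b_flg = XAB$M_DUP;           /* duplicates allowed here */

    int sts = sys$create(&fab, 0, 0);
    if (sts & 1) sys$close(&fab, 0, 0);
    return sts;
}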

Hardly a gee-whiz Oracle database but it can do a little more than most
people realize.

Not that it matters since the OP stated this was a migration away from VMS.
abrsvc
2020-11-20 02:41:13 UTC
Post by Michael Moroney
For whatever it's worth, keys in RMS index files can overlap and be segmented.
It's possible for the primary key to be a record number and the secondary key
be the record number with a character (A,B,L, blank) appended to the end. Both
would have no duplicates but both would be of the same type, likely a character
string. You could have a third overlapping key which is just the (A,B,L, blank)
if you wanted to find all the "B" records for example.
Hardly a gee-whiz Oracle database but it can do a little more than most
people realize.
Not that it matters since the OP stated this was a migration away from VMS.
I personally understand the use of segmented keys as suggested. This application was created almost 40 years ago and was not very efficient. It uses files as data-passing mechanisms, and rather than locking, it uses file opens and closes, cycling on the open if locked until the open actually happens. This is not the least efficient mechanism used, either. Indexed files with one record are common too. ARG!! Anyway, this application could have run on a system 75% smaller than it is and still have had extra capacity had it been written correctly. But as stated, the effort here is to understand the underlying flow such that it can be replaced with more modern hardware. I got involved way too late to influence the decision. OpenVMS could easily have remained, with some interfacing to PLCs for hardware control, but too late.

Before this thread gets any further from the initial questions, I appreciate the discussion and clarification from all involved.