Discussion:
Something is happening at VSI
John H. Reinhardt
2024-04-01 03:33:22 UTC
I've been checking the VSI Service Platform to see if the x86 Community PAK got renewed over the weekend, just in case they decided to put out something to keep my x86 systems from expiring Monday (or will things stop working at midnight? I've never been around when a PAK expired). All weekend there was no change - until tonight. It's 10:20 CDT and not even midnight Eastern time, but my package listing is blank. It looks like all access to x86 packages there has been revoked. At least to me.

Both letters I got from VSI mentioned continuing x86 Community Licensing and I thought I saw someplace that said if you already had an x86 CLP then they would automatically send an email with the new credentials/setup but re-reading both I don't see that so I'm not sure if I imagined it or it was someplace else or that my reading comprehension failed.

It will be interesting to see what April Fool's Day brings.
--
John H. Reinhardt
Single Stage to Orbit
2024-04-01 09:29:17 UTC
Post by John H. Reinhardt
I've been checking the VSI Service Platform to see if the x86
Community PAK got renewed over the weekend just in case they decided
to put out something to keep my x86 systems from expiring Monday (or
will things stop working at midnight? I've never been around when a
PAK expired). All weekend there was no change - until tonight.  It's
10:20 CDT and not even midnight Eastern time but my package listing
is blank.  It looks like all access to x86 packages there has been
revoked.  At least to me.
Both letters I got from VSI mentioned continuing x86 Community
Licensing and I thought I saw someplace that said if you already had
an x86 CLP then they would automatically send an email with the new
credentials/setup but re-reading both I don't see that so I'm not
sure if I imagined it or it was someplace else or that my reading
comprehension failed.
It will be interesting to see what April Fool's Day brings.
Same here. Waiting to see what happens next.
--
Tactical Nuclear Kittens
John H. Reinhardt
2024-04-01 09:49:46 UTC
Post by Single Stage to Orbit
Post by John H. Reinhardt
I've been checking the VSI Service Platform to see if the x86
Community PAK got renewed over the weekend just in case they decided
to put out something to keep my x86 systems from expiring Monday (or
will things stop working at midnight? I've never been around when a
PAK expired). All weekend there was no change - until tonight.  It's
10:20 CDT and not even midnight Eastern time but my package listing
is blank.  It looks like all access to x86 packages there has been
revoked.  At least to me.
Both letters I got from VSI mentioned continuing x86 Community
Licensing and I thought I saw someplace that said if you already had
an x86 CLP then they would automatically send an email with the new
credentials/setup but re-reading both I don't see that so I'm not
sure if I imagined it or it was someplace else or that my reading
comprehension failed.
It will be interesting to see what April Fool's Day brings.
Same here. Waiting to see what happens next.
PAK-wise, I guess we still have today (1-APR-2024) and things will stop when the date rolls over to 2-APR-2024. I couldn't remember if the termination date took effect ON the date or if that was the last day.
--
John H. Reinhardt
Dave Froble
2024-04-01 14:04:04 UTC
Post by John H. Reinhardt
Post by Single Stage to Orbit
Post by John H. Reinhardt
I've been checking the VSI Service Platform to see if the x86
Community PAK got renewed over the weekend just in case they decided
to put out something to keep my x86 systems from expiring Monday (or
will things stop working at midnight? I've never been around when a
PAK expired). All weekend there was no change - until tonight. It's
10:20 CDT and not even midnight Eastern time but my package listing
is blank. It looks like all access to x86 packages there has been
revoked. At least to me.
Both letters I got from VSI mentioned continuing x86 Community
Licensing and I thought I saw someplace that said if you already had
an x86 CLP then they would automatically send an email with the new
credentials/setup but re-reading both I don't see that so I'm not
sure if I imagined it or it was someplace else or that my reading
comprehension failed.
It will be interesting to see what April Fool's Day brings.
Same here. Waiting to see what happens next.
PAK-wise, I guess we still have today (1-APR-2024) and things will stop when the
date rolls over to 2-APR-2024. I couldn't remember if the termination date took
effect ON the date or if that was the last day.
One of our customers had a problem. Their PAKs expired in December, but the
system wasn't re-booted until a recent power outage. It appears that perhaps
the licenses continued to work, but, would not re-load upon re-boot.

Now, never re-booting isn't a solution. But perhaps the expired PAKs will work
until a re-boot, and failure to load, occurs. Then again there have been claims
of long uptimes on VMS.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Craig A. Berry
2024-04-01 11:58:46 UTC
Post by John H. Reinhardt
Both letters I got from VSI mentioned continuing x86 Community Licensing
and I thought I saw someplace that said if you already had an x86 CLP
then they would automatically send an email with the new
credentials/setup but re-reading both I don't see that so I'm not sure
if I imagined it or it was someplace else or that my reading
comprehension failed.
What they said was they will send anyone who has previously applied for
an x86 license a link to the new vmdk that already has software and
licenses on it. My understanding is that the service portal is going
away for CL users and what you get on the vmdk is what you'll have until
the next one comes out the following year.

Regarding Alpha and Integrity, the announcement on the web page says,
"Effective immediately, we will discontinue offering new community
licenses for non-commercial use for Alpha and Integrity. Existing
holders of community licenses for these architectures will get updates
for those licenses and retain their access to the Service Portal until
March 2025 for Alpha and December 2025 for Integrity."[1] Which is odd
since my community membership never got me access to anything related to
Alpha and Integrity on the portal.

[1] https://vmssoftware.com/about/news/2024-03-25-community-license-update/
Single Stage to Orbit
2024-04-01 12:46:23 UTC
Existing holders of community licenses for these architectures will
get updates for those licenses and retain their access to the Service
Portal until March 2025 for Alpha and December 2025 for
Integrity."[1]  Which is odd since my community membership never got
me access to anything related to Alpha and Integrity on the portal.
I have access to the x86 portal but there is nothing there, no updates
or packages.
--
Tactical Nuclear Kittens
John H. Reinhardt
2024-04-01 14:05:58 UTC
Post by Craig A. Berry
Post by John H. Reinhardt
Both letters I got from VSI mentioned continuing x86 Community Licensing and I thought I saw someplace that said if you already had an x86 CLP then they would automatically send an email with the new credentials/setup but re-reading both I don't see that so I'm not sure if I imagined it or it was someplace else or that my reading comprehension failed.
What they said was they will send anyone who has previously applied for
an x86 license a link to the new vmdk that already has software and
licenses on it. My understanding is that the service portal is going
away for CL users and what you get on the vmdk is what you'll have until
the next one comes out the following year.
Regarding Alpha and Integrity, the announcement on the web page says,
"Effective immediately, we will discontinue offering new community
licenses for non-commercial use for Alpha and Integrity. Existing
holders of community licenses for these architectures will get updates
for those licenses and retain their access to the Service Portal until
March 2025 for Alpha and December 2025 for Integrity."[1]  Which is odd
since my community membership never got me access to anything related to
Alpha and Integrity on the portal.
[1] https://vmssoftware.com/about/news/2024-03-25-community-license-update/
Yes, that's where it was, on the web site. Thanks Craig!

At 8:34am CDT I received the email with the information and the download link for the vmdk. It's a zip file with a 5GB virtual disk. I'm in the process of uploading it to my ESXi host and creating a system with it. Also in the email is a link to a page describing how to set up a VM using VirtualBox. I'm just cloning one of the two VMS VMs I already have.
--
John H. Reinhardt
bill
2024-04-01 13:32:01 UTC
Post by John H. Reinhardt
It will be interesting to see what April Fool's Day brings.
Maybe the April Fools were the ones who banked on life under
VMS continuing as it had in the past? :-(

bill
Jouk Jansen
2024-04-02 06:10:51 UTC
Craig A. Berry wrote on 1-APR-2024 13:05:15.61
Post by Craig A. Berry
What they said was they will send anyone who has previously applied for
an x86 license a link to the new vmdk that already has software and
licenses on it. My understanding is that the service portal is going
away for CL users and what you get on the vmdk is what you'll have until
the next one comes out the following year.
I did not get that E-mail (should have got it because I got CLP licenses).
Seems no vmdk for me.

Jouk



Pax, vel iniusta, utilior est quam iustissimum bellum.
(free after Marcus Tullius Cicero (106 b.Chr.-46 b.Chr.)
Epistularum ad Atticum 7.1.4.3)


Touch not the cat bot a glove

Jouk Jansen
***@hrem.nano.tudelft.nl

Technische Universiteit Delft
Kavli Institute of Nanoscience
Nationaal centrum voor HREM
Lorentzweg 1
2628 CJ Delft
Nederland
tel. +31-15-2782272
Craig A. Berry
2024-04-02 12:57:48 UTC
Post by Jouk Jansen
Craig A. Berry wrote on 1-APR-2024 13:05:15.61
Post by Craig A. Berry
What they said was they will send anyone who has previously applied for
an x86 license a link to the new vmdk that already has software and
licenses on it. My understanding is that the service portal is going
away for CL users and what you get on the vmdk is what you'll have until
the next one comes out the following year.
I did not get that E-mail (should have got it because I got CLP licenses).
Seems no vmdk for me.
I have not gotten such an e-mail either.
mjos_examine
2024-04-02 14:22:53 UTC
Post by Craig A. Berry
Post by Jouk Jansen
I did not get that E-mail (should have got it because I got CLP licenses).
Seems no vmdk for me.
I have not gotten such an e-mail either.
It sounds like there may be many of us that have not received that email
as of yet, so we're kind of stuck.

Whether our emails are queued up, or we've been overlooked, we have no
way of knowing at this point.
motk
2024-04-02 23:10:35 UTC
Post by mjos_examine
Post by Craig A. Berry
Post by Jouk Jansen
I did not get that E-mail (should have got it because I got CLP licenses).
Seems no vmdk for me.
I have not gotten such an e-mail either.
It sounds like there may be many of us that have not received that email
as of yet, so we're kind of stuck.
Whether our emails are queued up, or we've been overlooked, we have no
way of knowing at this point.
I didn't receive an email for seven months; when I mentioned this on the
forum I was chided for impatience and then was told I had been sent an
email. Checked entire email server logs, nothing there. I asked if it
could be resent and have had no reply.

Friends, it's not looking good.
--
motk
motk
2024-04-03 00:10:20 UTC
Post by motk
I didn't receive an email for seven months; when I mentioned this on the
forum I was chided for impatience and then was told I had been sent an
email. Checked entire email server logs, nothing there. I asked if it
could be resent and have had no reply.
And ... it just appeared! Well, there's a thing.
--
motk
motk
2024-04-03 00:20:47 UTC
Post by motk
Post by motk
I didn't receive an email for seven months; when I mentioned this on
the forum I was chided for impatience and then was told I had been
sent an email. Checked entire email server logs, nothing there. I
asked if it could be resent and have had no reply.
And ... it just appeared! Well, there's a thing.
Doing a qm import into proxmox now. I've been a bit intemperate, and I'm
not convinced in any way that this is the right thing to do with the CL,
but I'll give it a go.
--
motk
motk
2024-04-04 07:52:48 UTC
Post by motk
Post by motk
Post by motk
I didn't receive an email for seven months; when I mentioned this on
the forum I was chided for impatience and then was told I had been
sent an email. Checked entire email server logs, nothing there. I
asked if it could be resent and have had no reply.
And ... it just appeared! Well, there's a thing.
Doing a qm import into proxmox now. I've been a bit intemperate, and I'm
not convinced in any way that this is the right thing to do with the CL,
but I'll give it a go.
After working out I needed to add "args: -no-hpet" to the server conf in
/etc/pve/qemu-server/xxx.conf, it's been working well enough.
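
For anyone repeating this, here is a minimal sketch in Python of adding that line to the text of a VM config without duplicating it. It rests on assumptions that are not from the post: the main section of the file is taken to end at the first line starting with "[" (a snapshot section header), and `ensure_args_flag` is a hypothetical helper name. It only transforms a string; reading and writing the real file under /etc/pve is left out.

```python
def ensure_args_flag(conf_text: str, flag: str = "-no-hpet") -> str:
    """Return conf_text with `flag` present on an 'args:' line in the main section.

    Assumption: the main VM section ends at the first line starting with "[".
    """
    lines = conf_text.splitlines()
    # Index of the first section header, or end of file if there is none.
    end = next((i for i, ln in enumerate(lines) if ln.startswith("[")), len(lines))
    for i in range(end):
        if lines[i].startswith("args:"):
            # An args line already exists: append the flag only if it is missing.
            if flag not in lines[i][len("args:"):].split():
                lines[i] = lines[i].rstrip() + " " + flag
            break
    else:
        # No args line in the main section: add one just before any section header.
        lines.insert(end, f"args: {flag}")
    return "\n".join(lines) + "\n"


sample = "cores: 2\nmemory: 4096\n"
print(ensure_args_flag(sample), end="")
```

Running it twice on the same text leaves the second result unchanged, so it is safe to apply to a config that already has the flag.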

I did apparently accidentally fuzz things:

$ product list *

  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000007
                                 0000000000006000
                                 FFFF830007C0236B
                                 0000000000000012
    Register dump:
    RAX = 0000000000000000  RDI = 000000007FF9DC80  RSI = 0000000000006000
    RDX = 5344524F5759454B  RCX = 0000000000006000  R8  = 00000000FFFF8F86
    R9  = 000000000808080D  RBX = 000000007FFABE00  RBP = 000000007FF9FF60
    R10 = 000000007FFABDB0  R11 = 000000007FFA4D18  R12 = 000000007FF9C0F8
    R13 = 0000000000000018  R14 = 000000007FF9C2B0  R15 = 0000000000008301
    RIP = FFFF830007C0236B  RSP = 000000007FF9FF00  SS  = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD5D8EE

  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000014
                                 0000000000000018
                                 000000007FF9CE98
                                 0000000000000012
    Register dump:
    RAX = 0000000000000001  RDI = 000000007FF7BC20  RSI = 0000000000000001
    RDX = 0000000000000000  RCX = FFFFFFFF8AC09B5E  R8  = 000000007ACBD11F
    R9  = 0000000000000106  RBX = 000000007FFABE00  RBP = 000000007FF9F0A8
    R10 = 000000007FFA4D18  R11 = 000000007FFA4D18  R12 = 000000007FF9F060
    R13 = 000000007FFCDCAC  R14 = 0000000000000002  R15 = 000000003DA78301
Connection closed by foreign host.000000007FF9CA30 SS = 000000000000001B
Volker Halle
2024-04-04 08:49:15 UTC
...
$ product list  *
  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000007
                                 0000000000006000
                                 FFFF830007C0236B
                                 0000000000000012
    RAX = 0000000000000000  RDI = 000000007FF9DC80  RSI = 0000000000006000
    RDX = 5344524F5759454B  RCX = 0000000000006000  R8  = 00000000FFFF8F86
    R9  = 000000000808080D  RBX = 000000007FFABE00  RBP = 000000007FF9FF60
    R10 = 000000007FFABDB0  R11 = 000000007FFA4D18  R12 = 000000007FF9C0F8
    R13 = 0000000000000018  R14 = 000000007FF9C2B0  R15 = 0000000000008301
    RIP = FFFF830007C0236B  RSP = 000000007FF9FF00  SS  = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD5D8EE
motk,

can you reproduce this? Over and over again?

If so, please consider reporting this in the VSI Forum together with a
detailed description of your underlying software/hardware and the steps
to reproduce this access violation.

https://forum.vmssoftware.com/search.php?search_id=active_topics

Volker.
motk
2024-04-04 09:59:55 UTC
Post by Volker Halle
can you reproduce this? Over and over again?
If so, please consider reporting this in the VSI Forum together with a
detailed description of your underlying software/hardware and the steps
to reproduce this access violation.
https://forum.vmssoftware.com/search.php?search_id=active_topics
If I can reproduce it, I will. No luck so far.
--
motk
Simon Clubley
2024-04-04 12:43:09 UTC
Post by Volker Halle
...
$ product list  *
  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000007
                                 0000000000006000
                                 FFFF830007C0236B
                                 0000000000000012
    RAX = 0000000000000000  RDI = 000000007FF9DC80  RSI = 0000000000006000
    RDX = 5344524F5759454B  RCX = 0000000000006000  R8  = 00000000FFFF8F86
    R9  = 000000000808080D  RBX = 000000007FFABE00  RBP = 000000007FF9FF60
    R10 = 000000007FFABDB0  R11 = 000000007FFA4D18  R12 = 000000007FF9C0F8
    R13 = 0000000000000018  R14 = 000000007FF9C2B0  R15 = 0000000000008301
    RIP = FFFF830007C0236B  RSP = 000000007FF9FF00  SS  = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD5D8EE
motk,
can you reproduce this? Over and over again?
If so, please consider reporting this in the VSI Forum together with a
detailed description of your underlying software/hardware and the steps
to reproduce this access violation.
https://forum.vmssoftware.com/search.php?search_id=active_topics
Well that's a seriously "interesting" thing to suggest Volker. :-(

Assuming the posted output has not been edited, running a user-mode program
resulted in the process being killed. That means it failed in either
supervisor mode or executive mode. That means VSI need to be told _privately_
about the sequence to reproduce (if it can be discovered) so that they can
see if the sequence can be modified to actually exploit the system.

If that sequence is published in a public forum before VSI have analyzed
and fixed the problem, it means anyone else will be able to perform the
same analysis to see if the problem can be exploited. :-(

Of course, the OP will have to report this to VSI via insecure email
(if they can discover the sequence) because VSI clearly think it is
below them to actually make available a public security reporting
mechanism. :-(

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
motk
2024-04-05 09:49:04 UTC
Post by Simon Clubley
Post by Volker Halle
...
$ product list  *
  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000007
                                 0000000000006000
                                 FFFF830007C0236B
                                 0000000000000012
    RAX = 0000000000000000  RDI = 000000007FF9DC80  RSI = 0000000000006000
    RDX = 5344524F5759454B  RCX = 0000000000006000  R8  = 00000000FFFF8F86
    R9  = 000000000808080D  RBX = 000000007FFABE00  RBP = 000000007FF9FF60
    R10 = 000000007FFABDB0  R11 = 000000007FFA4D18  R12 = 000000007FF9C0F8
    R13 = 0000000000000018  R14 = 000000007FF9C2B0  R15 = 0000000000008301
    RIP = FFFF830007C0236B  RSP = 000000007FF9FF00  SS  = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD5D8EE
motk,
can you reproduce this? Over and over again?
If so, please consider reporting this in the VSI Forum together with a
detailed description of your underlying software/hardware and the steps
to reproduce this access violation.
https://forum.vmssoftware.com/search.php?search_id=active_topics
Well that's a seriously "interesting" thing to suggest Volker. :-(
Assuming the posted output has not been edited, running a user-mode program
resulted in the process being killed. That means it failed in either
supervisor mode or executive mode. That means VSI need to be told _privately_
about the sequence to reproduce (if it can be discovered) so that they can
see if the sequence can be modified to actually exploit the system.
I can guarantee that output wasn't mangled in any way. I've been trying
to reproduce it with no success so far. It was an extremely surprising
result though; would it have hit a log somewhere?
Post by Simon Clubley
If that sequence is published in a public forum before VSI have analyzed
and fixed the problem, it means anyone else will be able to perform the
same analysis to see if the problem can be exploited. :-(
Fuzzing tools are free and plentiful.
Post by Simon Clubley
Simon.
--
motk
Simon Clubley
2024-04-05 12:27:31 UTC
Post by motk
Post by Simon Clubley
Post by motk
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD5D8EE
Assuming the posted output has not been edited, running a user-mode program
resulted in the process being killed. That means it failed in either
supervisor mode or executive mode. That means VSI need to be told _privately_
about the sequence to reproduce (if it can be discovered) so that they can
see if the sequence can be modified to actually exploit the system.
I can guarantee that output wasn't mangled in any way. I've been trying
to reproduce it with no success so far. It was an extremely surprising
result though; would it have hit a log somewhere?
The exit status on the accounting log may be interesting.

Try a "$ acc/since=<whatever>/full" and once you have found the correct
session, have a look at the exit status.

Assuming the current mode bits on x86-64 in the PS register match those
on Alpha, then, assuming I am decoding the bitfield correctly, it looks
like an executive mode failure (ie: a failure in RMS itself). :-(

Can someone confirm I am decoding the current mode bits in the above
PS register correctly ?
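
A minimal sketch of that decoding in Python, for anyone who wants to check it. It assumes, as stated above, that the current-mode field sits in PS bits <4:3> with the Alpha encoding (0 = kernel, 1 = executive, 2 = supervisor, 3 = user). Whether x86-64 VMS reports PS in that layout is exactly the open question, so treat the result as conditional on it. `current_mode` is a hypothetical helper name.

```python
# Assumption (not confirmed for x86-64 VMS): current mode is PS bits <4:3>,
# encoded as on Alpha: 0 = kernel, 1 = executive, 2 = supervisor, 3 = user.
MODES = ("kernel", "executive", "supervisor", "user")


def current_mode(ps: int) -> str:
    """Extract the two-bit current-mode field from a PS value."""
    return MODES[(ps >> 3) & 0b11]


# PS value from the ACCVIO message in the first crash output.
print(current_mode(0x7AD5D8EE))  # executive
```

Under that assumption the low byte 0xEE gives a mode field of 01, which agrees with the executive mode reading above.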

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Simon Clubley
2024-04-05 13:20:16 UTC
Post by Simon Clubley
Post by motk
Post by Simon Clubley
Post by motk
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD5D8EE
Assuming the posted output has not been edited, running a user-mode program
resulted in the process being killed. That means it failed in either
supervisor mode or executive mode. That means VSI need to be told _privately_
about the sequence to reproduce (if it can be discovered) so that they can
see if the sequence can be modified to actually exploit the system.
I can guarantee that output wasn't mangled in any way. I've been trying
to reproduce it with no success so far. It was an extremely surprising
result though; would it have hit a log somewhere?
The exit status on the accounting log may be interesting.
Try a "$ acc/since=<whatever>/full" and once you have found the correct
session, have a look at the exit status.
Assuming the current mode bits on x86-64 in the PS register match those
on Alpha, then, assuming I am decoding the bitfield correctly, it looks
like an executive mode failure (ie: a failure in RMS itself). :-(
Can someone confirm I am decoding the current mode bits in the above
PS register correctly ?
Another idea: If this really is an executive mode failure, I wonder if
setting BUGCHECKFATAL to 1 would be useful here ?

Alex: What this would do is to turn the failure into a failure that
crashes the system (and writes a dumpfile, assuming the VSI virtual
image is set up correctly) instead of just deleting the current process.

It would also mean anything in memory (including command history, etc)
would be written to the dumpfile, so make sure there's nothing private
in the memory of your system before performing more tests.

What you could do then is to compress the dumpfile and send it to VSI
privately via some means they give you.

BTW, another idea: does x86-64 VMS currently write an entry into the
errorlog on an executive mode bugcheck ? I wonder if it would be useful
to check the errorlog to see if there is anything useful there from the
previous failure ?

Simon.

PS: pointed message to VSI management: A hobbyist appears to have just
found a process-deleting bug within VMS that your testing has missed so far.
_This_ is an example of why the hobbyist program is important, and of
benefit, to you.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Simon Clubley
2024-04-05 13:22:22 UTC
Post by Simon Clubley
Alex: What this would do is to turn the failure into a failure that
crashes the system (and writes a dumpfile, assuming the VSI virtual
image is set up correctly) instead of just deleting the current process.
Oops. motk, not Alex. Sorry. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Single Stage to Orbit
2024-04-05 15:48:10 UTC
Post by Simon Clubley
Alex: What this would do is to turn the failure into a failure that
crashes the system (and writes a dumpfile, assuming the VSI virtual
image is set up correctly) instead of just deleting the current
process.
Who, me? I think it's actually motk with the issue, is it not?
--
Tactical Nuclear Kittens
Simon Clubley
2024-04-05 17:15:06 UTC
Post by Single Stage to Orbit
Post by Simon Clubley
Alex: What this would do is to turn the failure into a failure that
crashes the system (and writes a dumpfile, assuming the VSI virtual
image is set up correctly) instead of just deleting the current process.
Who, me? I think it's actually motk with the issue, is it not?
Yes, you. :-) As I posted in a followup, I got the name wrong. Sorry. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
motk
2024-04-06 09:39:20 UTC
Post by Simon Clubley
Another idea: If this really is an executive mode failure, I wonder if
setting BUGCHECKFATAL to 1 would be useful here ?
Alex: What this would do is to turn the failure into a failure that
crashes the system (and writes a dumpfile, assuming the VSI virtual
image is set up correctly) instead of just deleting the current process.
It would also mean anything in memory (including command history, etc)
would be written to the dumpfile, so make sure there's nothing private
in the memory of your system before performing more tests.
What you could do then is to compress the dumpfile and send it to VSI
privately via some means they give you.
BTW, another idea: does x86-64 VMS currently write an entry into the
errorlog on an executive mode bugcheck ? I wonder if it would be useful
to check the errorlog to see if there is anything useful there from the
previous failure ?
Found some time today to try installing Apache, trying random
half-remembered commands and looking through documentation that just
talks about AXP and Integrity stuff, and lo:

$ show proc

  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000004
                                 0000000000000878
                                 FFFF830007C02367
                                 0000000000000012
    Register dump:
    RAX = 000000007FF9DC80  RDI = 000000007FF9DC80  RSI = 0000000000000878
    RDX = 0000000000000000  RCX = 0000000000000878  R8  = 00000000FFFF8F81
    R9  = 000000000808080D  RBX = 000000007FFABE00  RBP = 000000007FF9E540
    R10 = 000000007FFABDB0  R11 = 000000007FFA4D18  R12 = 000000007FF9C648
    R13 = 0000000000000018  R14 = 000000007FF9C800  R15 = 0000000000000201
    RIP = FFFF830007C02367  RSP = 000000007FF9E4E0  SS  = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD82CC6

  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000000
                                 8000000000000000
                                 000000007AE15900
                                 0000000000000012
    Register dump:
    RAX = 0000000000000001  RDI = FFFFFFFF77761C88  RSI = 0000000002040001
    RDX = 0000000000000000  RCX = FFFFFFFF8AC09B5E  R8  = 000000007ACBD11F
    R9  = 0000000000000106  RBX = 000000007FFABE00  RBP = 000000007FF9D608
    R10 = 000000007FFA4D18  R11 = 000000007FFA4D18  R12 = 000000007FF9D5C0
    R13 = 000000007FFCDCAC  R14 = 0000000000000002  R15 = 0000000045178301
Connection closed by foreign host.000000007FF9CA28 SS = 000000000000001B

Now I'll try and work out where dumpfiles go - I did turn on
SYSTEM_CHECK and write it to CURRENT so hopefully it's there somewhere.
Post by Simon Clubley
Simon.
PS: pointed message to VSI management: A hobbyist appears to have just
found a process-deleting bug within VMS that your testing has missed so far.
_This_ is an example of why the hobbyist program is important, and of
benefit, to you.
Yeah, noobs do non-obvious stuff and find weird edge cases.
--
motk
motk
2024-04-06 09:50:54 UTC
Post by motk
Now I'll try and work out where dumpfiles go - I did turn on
SYSTEM_CHECK and write it to CURRENT so hopefully it's there somewhere.
Yeah, nah, nothing in SYS$SYSTEM:SYSDUMP.DMP.
--
motk
Simon Clubley
2024-04-08 12:22:40 UTC
Post by motk
Post by motk
Now I'll try and work out where dumpfiles go - I did turn on
SYSTEM_CHECK and write it to CURRENT so hopefully it's there somewhere.
Yeah, nah, nothing in SYS$SYSTEM:SYSDUMP.DMP.
It could be in the pagefile (or VSI may have disabled dumps on this
image they are now shipping.)

Here is a writeup from the VSI Wiki you should find useful and which
will guide you through the various possibilities:

https://wiki.vmssoftware.com/Dump_File

BTW, I am surprised by the lack of answers to this question over the
weekend before I got to see it just now. In the old days, there would
have been a lot of answers to your question (and to my other technical
questions) by now. People still complain about the off-topic stuff, but
when there's a series of good technical questions, there are no longer
any answers to them.

I wonder if the VMS online community has finally collapsed and there
are only a few people left.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
motk
2024-04-08 01:36:26 UTC
On 4/6/24 19:50, motk wrote:

Everything is going great.

$ help analyze

  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000004
                                 00000000000007D8
                                 FFFF830007C02367
                                 0000000000000012
    Register dump:
    RAX = 000000007FF9DC80  RDI = 000000007FF9DC80  RSI = 00000000000007D8
    RDX = 0000000000000000  RCX = 00000000000007D8  R8  = 00000000FFFF8F84
    R9  = 000000000808080D  RBX = 000000007FFABE00  RBP = 000000007FF9E4A0
    R10 = 000000007FFABDB0  R11 = 000000007FFA4D18  R12 = 000000007FF9C648
    R13 = 0000000000000018  R14 = 000000007FF9C800  R15 = 0000000000008301
    RIP = FFFF830007C02367  RSP = 000000007FF9E440  SS  = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD44D2F

Improperly handled condition, bad stack or no handler specified.
Signal arguments: Number = 0000000000000005
Name = 000000000000000C
0000000000000014
0000000000000000
0000000000000000
0000000000000012
Register dump:
RAX = 0000000000000001 RDI = FFFFFFFF776521E8 RSI = 00000000020C0001
RDX = 0000000000000000 RCX = FFFFFFFF8AC09B5E R8 = 000000007ACBD11F
R9 = 0000000004000106 RBX = 000000007FFABE00 RBP = 000000007FF9D568
R10 = 000000007FFA4D18 R11 = 000000007FFA4D18 R12 = 000000007FF9D520
R13 = 000000007FFCDCAC R14 = 0000000000000002 R15 = 000000004B9E8301
Connection to 192.168.188.121 closed.000007FF9CA30 SS = 000000000000001B
--
motk
Arne Vajhøj
2024-04-08 02:20:42 UTC
Permalink
Post by motk
$ help analyze
  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000004
                                 00000000000007D8
                                 FFFF830007C02367
                                 0000000000000012
    RAX = 000000007FF9DC80  RDI = 000000007FF9DC80  RSI = 00000000000007D8
    RDX = 0000000000000000  RCX = 00000000000007D8  R8  = 00000000FFFF8F84
    R9  = 000000000808080D  RBX = 000000007FFABE00  RBP = 000000007FF9E4A0
    R10 = 000000007FFABDB0  R11 = 000000007FFA4D18  R12 = 000000007FF9C648
    R13 = 0000000000000018  R14 = 000000007FF9C800  R15 = 0000000000008301
    RIP = FFFF830007C02367  RSP = 000000007FF9E440  SS  = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD44D2F
  Improperly handled condition, bad stack or no handler specified.
    Signal arguments:   Number = 0000000000000005
                        Name   = 000000000000000C
                                 0000000000000014
                                 0000000000000000
                                 0000000000000000
                                 0000000000000012
    RAX = 0000000000000001  RDI = FFFFFFFF776521E8  RSI = 00000000020C0001
    RDX = 0000000000000000  RCX = FFFFFFFF8AC09B5E  R8  = 000000007ACBD11F
    R9  = 0000000004000106  RBX = 000000007FFABE00  RBP = 000000007FF9D568
    R10 = 000000007FFA4D18  R11 = 000000007FFA4D18  R12 = 000000007FF9D520
    R13 = 000000007FFCDCAC  R14 = 0000000000000002  R15 = 000000004B9E8301
Connection to 192.168.188.121 closed.000007FF9CA30  SS  = 000000000000001B
I am getting the idea that VMS does not like your VM.

It seems unlikely to me that:

$ product list *
$ help analyze

trigger some x86-64 specific DCL or RMS bug that nobody else has
encountered.

I consider it more likely that VMS and the CPU/virtual memory
environment your VM provides disagree on something, causing
random sporadic memory related errors.

But then this stuff is way outside of my expertise area
so I am probably wrong.

Arne
motk
2024-04-08 02:30:33 UTC
Permalink
Post by Arne Vajhøj
I consider it more likely that VMS and the CPU/virtual memory
environment your VM provides disagree on something, causing
random sporadic memory related errors.
It seems odd, I agree. This was running on an intel nuc with a 12th gen
i5 cpu, and I wonder if openvms doesn't like straddling P/E cores.

I've previously done some memory burn-in on that node without issues.

I've migrated it over to a plain 6th gen node with 4 boring cores; let's
see if that improves things.
Post by Arne Vajhøj
Arne
--
motk
Craig A. Berry
2024-04-08 12:18:40 UTC
Permalink
Post by motk
Post by Arne Vajhøj
I consider it more likely that VMS and the CPU/virtual memory
environment your VM provides disagree on something, causing
random sporadic memory related errors.
It seems odd, I agree. This was running on an intel nuc with a 12th gen
i5 cpu, and I wonder if openvms doesn't like straddling P/E cores.
I've previously done some memory burn-in on that node without issues.
I've migrated it over to a plain 6th gen node with 4 boring cores; let's
see if that improves things.
I don't think 6th gen is supported is it? In any case, check your
host(s) with the Python script here:

https://vmssoftware.com/openkits/alpopensource/vmscheck.zip
motk
2024-04-09 00:05:59 UTC
Permalink
I don't think 6th gen is supported is it?  In any case, check your
Depends on what you mean by 'supported'. Unless there's some very odd
microcode or MSR setup that needs to be done, it should Just Work so long
as it meets the CPU criteria. We're in an age of commodity COTS
hardware now.
https://vmssoftware.com/openkits/alpopensource/vmscheck.zip
All my hardware checks out using the vmscheck.py script.
--
motk
Lawrence D'Oliveiro
2024-04-09 00:18:10 UTC
Permalink
We're in an age of commodity COTS hardware now.
Remember that x86 is not exactly an “open standard”--it is very much
controlled by proprietary vendors who never cease their search for that
vendor-lock-in edge.

Just because you have been pampered by the ongoing efforts of OS
developers in the Linux community and elsewhere to ensure that things
“just work” doesn’t mean it’s something you can take for granted.
motk
2024-04-09 00:31:16 UTC
Permalink
Post by Lawrence D'Oliveiro
Remember that x86 is not exactly an “open standard”--it is very much
controlled by proprietary vendors who never cease their search for that
vendor-lock-in edge.
Just because you have been pampered by the ongoing efforts of OS
developers in the Linux community and elsewhere to ensure that things
“just work” doesn’t mean it’s something you can take for granted.
Yeah, good point. That said, even if it's not an open standard with a
fixed system architecture, you have things like the x86-64 cpu
micro-arch feature levels to work with, eg
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html. Making the
supported versions narrow enough to not burden development but not so
narrow that it discourages use and therefore discovering interesting
bugs isn't an easy job.
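
As a rough illustration of working with those feature levels, here is a minimal Python sketch that classifies a CPU into x86-64-v1 through v4 from Linux-style /proc/cpuinfo flag names. The function names are made up for this example, and it says nothing about what OpenVMS itself actually requires.

```python
# Sketch only: map Linux /proc/cpuinfo flag names to x86-64 micro-architecture
# levels. Linux spells SSE3 as "pni" and reports LZCNT under "abm".
LEVELS = [
    (1, {"cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"}),
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level for which this and every lower level is satisfied."""
    level = 0
    for number, required in LEVELS:
        if not required <= flags:
            break  # a missing lower level disqualifies all higher ones
        level = number
    return level


def read_host_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU listed on a Linux host."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()
```

On a Linux hypervisor host, x86_64_level(read_host_flags()) gives the level the host CPU reaches; the guest may see fewer flags depending on the CPU model the VM is configured with.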
--
motk
Simon Clubley
2024-04-08 12:34:44 UTC
Permalink
Post by motk
Post by Arne Vajhøj
I consider it more likely that VMS and the CPU/virtual memory
environment your VM provides disagree on something, causing
random sporadic memory related errors.
It seems odd, I agree. This was running on an intel nuc with a 12th gen
i5 cpu, and I wonder if openvms doesn't like straddling P/E cores.
I've previously done some memory burn-in on that node without issues.
I've migrated it over to a plain 6th gen node with 4 boring cores; let's
see if that improves things.
It's still a VMS bug (even if VMS is being _way_ too fragile) IMHO.

IIRC, motk is using proxmox, which many other people are using just fine
to run other operating systems.

If there's something VMS needs or a configuration it doesn't support,
then that should be probed at boot time and VMS should refuse to continue
booting with the reason why being made clear. The bug in this case is that
this check is missing from VMS.

The other possibility is that VMS is _supposed_ to work OK in this
configuration, but this specific VM setup has been untested by VSI until
now. That means there is a bug in the VMS code itself which needs fixing.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Robert A. Brooks
2024-04-08 16:49:23 UTC
Permalink
Post by Simon Clubley
If there's something VMS needs or a configuration it doesn't support,
then that should be probed at boot time and VMS should refuse to continue
booting with the reason why being made clear. The bug in this case is that
this check is missing from VMS.
That's one way to look at it.

Another way is that we have been quite clear what the requirements are to run VMS.

Any variation from that is unsupported. We recognize that there are likely configurations
that are technically unsupported, but will still likely work. Preventing those
configurations from working is something we could do, but chose not to.
Post by Simon Clubley
The other possibility is that VMS is _supposed_ to work OK in this
configuration, but this specific VM setup has been untested by VSI until
now. That means there is a bug in the VMS code itself which needs fixing.
We are not claiming support for Proxmox, although that testing has begun.
Given that it is a KVM-based hypervisor, getting it fully supported should not
be difficult, but we're not there yet.

It is for these reasons that we've been quite conservative about what is supported.

We are interested in any feedback we get, but that doesn't mean we're going to respond to every
problem immediately when it's an unsupported configuration.
--
-- Rob
Simon Clubley
2024-04-08 17:30:15 UTC
Permalink
Post by Robert A. Brooks
Post by Simon Clubley
If there's something VMS needs or a configuration it doesn't support,
then that should be probed at boot time and VMS should refuse to continue
booting with the reason why being made clear. The bug in this case is that
this check is missing from VMS.
That's one way to look at it.
Another way is that we have been quite clear what the requirements are to run VMS.
Any variation from that is unsupported. We recognize that there are likely configurations
that are technically unsupported, but will still likely work. Preventing those
configurations from working is something we could do, but chose not to.
Given that the VMS mindset is supposed to be one of robustness and
reliability, perhaps the proper approach is to enforce a default refuse
to boot on unsupposed configuration, but allow an override with a boot
flag or SYSGEN parameter.

That way, people don't accidentally use an unsupported configuration in
production use, but you also don't stop people from using an unsupported
configuration if they choose to do so.

However, if you implement this, an impossible to miss message should be
output on every boot so that the flag is not set and then forgotten about.
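
To make the proposal concrete, here is a minimal Python sketch of the decision logic. It is purely hypothetical: the function name, the messages and the idea of a single boolean override are assumptions for illustration, not anything VMS implements.

```python
# Hypothetical policy sketch, not VMS code: refuse to boot an unsupported
# configuration by default, allow an explicit override, and warn loudly on
# every boot where the override is in effect.
def boot_decision(config_supported, override_set):
    """Return (continue_boot, console_message) for one boot attempt."""
    if config_supported:
        return True, ""
    if override_set:
        return True, "*** WARNING: UNSUPPORTED CONFIGURATION, OVERRIDE IN EFFECT ***"
    return False, "Boot refused: unsupported configuration (set the override to continue)"


for supported, override in [(True, False), (False, False), (False, True)]:
    proceed, message = boot_decision(supported, override)
    print(supported, override, proceed, message)
```

The warning is produced on every overridden boot rather than once, which is the point of the last paragraph above.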

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Simon Clubley
2024-04-08 17:32:05 UTC
Permalink
Post by Simon Clubley
Post by Robert A. Brooks
Post by Simon Clubley
If there's something VMS needs or a configuration it doesn't support,
then that should be probed at boot time and VMS should refuse to continue
booting with the reason why being made clear. The bug in this case is that
this check is missing from VMS.
That's one way to look at it.
Another way is that we have been quite clear what the requirements are to run VMS.
Any variation from that is unsupported. We recognize that there are likely configurations
that are technically unsupported, but will still likely work. Preventing those
configurations from working is something we could do, but chose not to.
Given that the VMS mindset is supposed to be one of robustness and
reliability, perhaps the proper approach is to enforce a default refuse
to boot on unsupposed configuration, but allow an override with a boot
s/unsupposed/unsupported/

Oops. :-)
Post by Simon Clubley
flag or SYSGEN parameter.
That way, people don't accidentally use an unsupported configuration in
production use, but you also don't stop people from using an unsupported
configuration if they choose to do so.
However, if you implement this, an impossible to miss message should be
output on every boot so that the flag is not set and then forgotten about.
Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Stephen Hoffman
2024-04-15 18:44:10 UTC
Permalink
Post by Simon Clubley
Given that the VMS mindset is supposed to be one of robustness and
reliability, perhaps the proper approach is to enforce a default refuse
to boot on unsupposed configuration, but allow an override with a boot
flag or SYSGEN parameter.
A fondness for arcana, inappropriate defaults, and documented and
ill-documented knobs—many of these defaults and these knobs tracing
back to the fallout arising from compatibility—seemingly became
commonly-accepted practice decades ago.

Put differently, documenting the potential for or incidents of a limit
or a crash or appropriately-trained and experienced staff has long been
viewed as meeting minimal requirements, either via the SPD or via some
other detail or document elsewhere. One of the bootcamp sessions
covering known issues with a supported feature was just surreal, for
instance. There have been other examples. Expectations and assumptions
can all change, too. I know mine have. But that's all fodder for
another time.
Post by Simon Clubley
That way, people don't accidentally use an unsupported configuration in
production use, but you also don't stop people from using an
unsupported configuration if they choose to do so.
I fixed a whole pile of support calls with diagnostics shown for
unsupported configurations, and fixed even more support calls by
automatically detecting and fixing the most common of the errors.

It's quite correct that unsupported-hardware configurations are
incredibly difficult or ~impossible to detect, but overt
mistakes—configuration detection analogous to that Python script—should
be built in to the OpenVMS console or incorporated into the early
bootstrap. Same for similar detection built into (or shared with) the
installer.

Following another longstanding OpenVMS practice of far too chatty
bootstraps (q.v. inappropriate defaults), adding diagnostics and adding
a hardware configuration display and showing an unsupported
configuration diagnostic as needed there would parallel longstanding
console practices.

I'd probably add the unsupported display into the x86-64 error analysis
tooling or into SHOW CPU or some other diagnostic tooling visible at
run-time, and shown in SDA for crashes. This is not to imply error-related
tooling is ever going to be remotely easy to create and maintain on
x86-64.

Architecturally, adding a hardware section into the system or
site-local device configuration database would make sense for this
detection. If somebody wants to mask the unsupported-configuration
error, they can edit their change into the site-local hardware
configuration data. (q.v. arcana)
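
A minimal Python sketch of that layering, with a system table and a site-local table that can mask the diagnostic. Every name and table entry here is hypothetical; none of this is a real OpenVMS data structure.

```python
# Hypothetical tables for illustration only.
SYSTEM_TABLE = {
    "hypervisor-a": "tested",
    "hypervisor-b": "known-bad",
}
SITE_LOCAL_TABLE = {
    "hypervisor-c": "masked",  # the site has chosen to silence the diagnostic
}


def unsupported_diagnostic(platform, system_table, site_table):
    """Return a diagnostic string, or None when nothing should be shown."""
    # Site-local entries take precedence over the system-supplied table.
    status = site_table.get(platform) or system_table.get(platform, "unknown")
    if status in ("tested", "masked"):
        return None
    if status == "known-bad":
        return f"{platform}: configuration is known not to work"
    return f"{platform}: configuration has not been tested"
```

Keeping the mask in the site-local table means a system-table update never silently discards a site's decision.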
Post by Simon Clubley
However, if you implement this, an impossible to miss message should be
output on every boot so that the flag is not set and then forgotten about.
Identifying everything unsupported is ~impossible given the constraints
OpenVMS operates under. But I wouldn't be concerned about adding some
diagnostics to an already-chatty bootstrap either, given normal OpenVMS
bootstraps and startups are purpose-designed to make that info easy to
hide and easy to ignore. Hardware, too. Default Itanium startups are
just stupidly chatty.
--
Pure Personal Opinion | HoffmanLabs LLC
Lawrence D'Oliveiro
2024-04-15 22:23:44 UTC
Permalink
Post by Stephen Hoffman
It's quite correct that unsupported-hardware configurations are
incredibly difficult or ~impossible to detect ...
Which is why it would have been so much simpler if VSI had not gone the
“not invented here” route, and built their x86 port on top of an existing
OS, leaving all the tricky hardware stuff to that.

Arne Vajhøj
2024-04-09 01:00:26 UTC
Permalink
Post by Robert A. Brooks
Post by Simon Clubley
If there's something VMS needs or a configuration it doesn't support,
then that should be probed at boot time and VMS should refuse to continue
booting with the reason why being made clear. The bug in this case is that
this check is missing from VMS.
That's one way to look at it.
Another way is that we have been quite clear what the requirements are to run VMS.
Any variation from that is unsupported.  We recognize that there are
likely configurations
that are technically unsupported, but will still likely work.
Preventing those
configurations from working is something we could do, but chose not to.
Post by Simon Clubley
The other possibility is that VMS is _supposed_ to work OK in this
configuration, but this specific VM setup has been untested by VSI until
now. That means there is a bug in the VMS code itself which needs fixing.
We are not claiming support for Proxmox, although that testing has begun.
Given that it is a KVM-based hypervisor, getting it fully supported should not
be difficult, but we're not there yet.
It is for these reasons that we've been quite conservative about what is supported.
We are interested in any feedback we get, but that doesn't mean we're
going to respond to every
problem immediately when it's an unsupported configuration.
I think the current practice makes sense.

It should be obvious that running an unsupported configuration
comes with a risk of problems.

It makes sense to have VMS complain about configs that are
known not to work.

But I think it would be very problematic with VMS complaining
over configs that are not known to work.

Because removing that test would require a release.

We would see:

...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
..

No thanks.

Arne
Lawrence D'Oliveiro
2024-04-09 01:43:20 UTC
Permalink
Post by Arne Vajhøj
Because removing that test would require a release.
You don’t have the concept of “volatile” package updates? Like the
timezone database, which changes several times a year?
motk
2024-04-09 05:35:07 UTC
Permalink
Post by Lawrence D'Oliveiro
You don’t have the concept of “volatile” package updates? Like the
timezone database, which changes several times a year?
It sounds like it's going to be a yearly build and then thrown over the
fence? Surely not.
--
motk
Dave Froble
2024-04-10 00:48:01 UTC
Permalink
Post by Lawrence D'Oliveiro
You don’t have the concept of “volatile” package updates? Like the
timezone database, which changes several times a year?
It sounds like it's going to be a yearly build and then thrown over the fence?
Surely not.
In the interest of disapproving the use of assumptions (ASS U ME), I'd guess
nobody yet knows what is going to happen. Perhaps even VSI!
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
motk
2024-04-10 01:55:09 UTC
Permalink
Post by Dave Froble
In the interest of disapproving the use of assumptions (ASS U ME), I'd
guess nobody yet knows what is going to happen.  Perhaps even VSI!
If the latter, that'd be a worry! Hard to tell though, not a lot of
communication happening.
--
motk
Simon Clubley
2024-04-09 12:45:44 UTC
Permalink
Post by Arne Vajhøj
But I think it would be very problematic with VMS complaining
over configs that are not known to work.
Because removing that test would require a release.
...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
..
No thanks.
It does not have to be a release - it could be a patch. It is also
absolutely no different from in the past when a new version of VMS
used to add support for new CPUs from DEC.

IOW, my suggested approach is a very long-established part of the
VMS world. The only difference now is that VMS would be allowed to
continue booting if you set an override flag or SYSGEN parameter.

Also, there should be no need to add support for "VM Bar 4" unless
it brought new functionality over "VM Bar 3" that you wanted to
support in VMS.

VMS is used in mission-critical production environments. You should
not be allowed to accidentally boot into an unsupported configuration
without being made _VERY_ aware of that fact.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2024-04-09 19:23:33 UTC
Permalink
Post by Simon Clubley
Post by Arne Vajhøj
But I think it would be very problematic with VMS complaining
over configs that are not known to work.
Because removing that test would require a release.
...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
..
No thanks.
It does not have to be a release - it could be a patch.
True.

But I don't like:
...
VMS 9.2-2 with HW patch 41
VMS 9.2-2 with HW patch 42
...

either.
Post by Simon Clubley
It is also
absolutely no different from in the past when a new version of VMS
used to add support for new CPUs from DEC.
Correct. This is the mechanism used in the past.

But the context has changed. Instead of DEC releasing a few new
models every year, we now have a huge number of VM, VM version,
host, host version combos.

H1..H4 is fine. H1..H100 is a problem.
Post by Simon Clubley
IOW, my suggested approach is a very long-established part of the
VMS world. The only difference now is that VMS would be allowed to
continue booting if you set an override flag or SYSGEN parameter.
Also, there should be no need to add support for "VM Bar 4" unless
it brought new functionality over "VM Bar 3" that you wanted to
support in VMS.
????

The interest in different VMs is not driven by what VMS needs,
but by what customers want.
Post by Simon Clubley
VMS is used in mission-critical production environments. You should
not be allowed to accidentally boot into an unsupported configuration
without being made _VERY_ aware of that fact.
Hopefully those running a mission critical production environment
on VMS read about supported configs before moving production to
that config and never run it on anything accidentally
booted.

Arne
Simon Clubley
2024-04-10 12:10:53 UTC
Permalink
Post by Arne Vajhøj
Post by Simon Clubley
Post by Arne Vajhøj
But I think it would be very problematic with VMS complaining
over configs that are not known to work.
Because removing that test would require a release.
...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
..
No thanks.
It does not have to be a release - it could be a patch.
True.
...
VMS 9.2-2 with HW patch 41
VMS 9.2-2 with HW patch 42
...
either.
How many VM solutions do you think there are out there ? :-)

Hint: there isn't 41 of them. :-)
Post by Arne Vajhøj
Post by Simon Clubley
IOW, my suggested approach is a very long-established part of the
VMS world. The only difference now is that VMS would be allowed to
continue booting if you set an override flag or SYSGEN parameter.
Also, there should be no need to add support for "VM Bar 4" unless
it brought new functionality over "VM Bar 3" that you wanted to
support in VMS.
????
The interest in different VMs is not driven by what VMS needs,
but by what customers want.
What customers need is implemented by turning it into what VMS needs...
Post by Arne Vajhøj
Post by Simon Clubley
VMS is used in mission-critical production environments. You should
not be allowed to accidentally boot into an unsupported configuration
without being made _VERY_ aware of that fact.
Hopefully those running a mission critical production environment
on VMS read about supported configs before moving production to
that config and never run it on anything accidentally
booted.
According to some people: "There is no need for anything safer than
the C or C++ programming language. You just have to be careful when writing
your code...". Your comment above is from the same incorrect mindset.

In the real world, people make mistakes, especially in an outsourced
environment where people cost, not people capability, is the driving
factor and hence people are not as skilled with VMS as they could be.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2024-04-10 13:45:45 UTC
Permalink
Post by Simon Clubley
Post by Arne Vajhøj
Post by Simon Clubley
Post by Arne Vajhøj
But I think it would be very problematic with VMS complaining
over configs that are not known to work.
Because removing that test would require a release.
...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
..
No thanks.
It does not have to be a release - it could be a patch.
True.
...
VMS 9.2-2 with HW patch 41
VMS 9.2-2 with HW patch 42
...
either.
How many VM solutions do you think there are out there ? :-)
Hint: there isn't 41 of them. :-)
Considering versions - yes there are.

And adding host and versions of that we will probably pass 410.
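
The arithmetic behind that estimate is just a product of independent dimensions. A small Python sketch, where every count is a made-up placeholder and not a survey of real products:

```python
from math import prod


def combination_count(*dimension_sizes):
    """Number of distinct configurations when every dimension varies independently."""
    return prod(dimension_sizes)


# Placeholder counts, chosen only to show how quickly the product grows.
vm_products = 5
versions_per_vm = 9
host_platforms = 5
versions_per_host = 2

print(combination_count(vm_products, versions_per_vm))  # 45
print(combination_count(vm_products, versions_per_vm,
                        host_platforms, versions_per_host))  # 450
```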
Post by Simon Clubley
Post by Arne Vajhøj
Post by Simon Clubley
VMS is used in mission-critical production environments. You should
not be allowed to accidentally boot into an unsupported configuration
without being made _VERY_ aware of that fact.
Hopefully those running a mission critical production environment
on VMS read about supported configs before moving production to
that config and never run it on anything accidentally
booted.
According to some people: "There is no need for anything safer than
the C or C++ programming language. You just have to be careful when writing
your code...". Your comment above is from the same incorrect mindset.
In the real world, people make mistakes, especially in an outsourced
environment where people cost, not people capability, is the driving
factor and hence people are not as skilled with VMS as they could be.
People make mistakes. Especially when it is easy to make
mistakes.

But bringing the mission critical production VM down, accidentally
installing an unsupported VM in the production environment, accidentally
copying the production environment from the supported VM to the
unsupported VM, and accidentally bringing it up on the unsupported VM
is not a mistake, but a series of mistakes. It is not a seconds/minutes
mess-up, but an hours/days mess-up.

It is good to protect against mistakes, but at some point one needs
to stop.

What is the plan to prevent people with privs from mistakenly doing:

$ DEL SYS$COMMON:[000000...]*.*;*

?

There is no plan. And I don't think we need a plan for that.

I don't think these weird examples are equivalent to people
messing up array indexes in languages not checking those.

These weird examples are the equivalent of the language
enabling checks by default and allowing developers to
bypass checks by putting an unsafe {} block around it and
then having the developer mistakenly put unsafe {}
around some code where it should not have been.

Arne
Dave Froble
2024-04-10 14:15:09 UTC
Permalink
Post by Arne Vajhøj
$ DEL SYS$COMMON:[000000...]*.*;*
Well, since similar has already happened ...

$ De [*...]*.*;*

:-)

I'd guess nothing would stop that.

And it didn't ...
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2024-04-10 17:21:11 UTC
Permalink
Post by Arne Vajhøj
Post by Simon Clubley
How many VM solutions do you think there are out there ? :-)
Hint: there isn't 41 of them. :-)
Considering versions - yes there are.
And adding host and versions of that we will probably pass 410.
Why would you need a new version of VMS whenever a new point release
of the host VM software is released ? The whole point of a VM is to
isolate the OS running under the VM from the underlying hardware/OS.

You test that the VM software is compatible with VMS and then assume
that all future point releases are also compatible. That's the whole
point of a VM.
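
A minimal Python sketch of that rule: test one baseline version, then treat later point releases as compatible. Restricting this to the same major version is an assumption added for the example, and the version strings are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as '8.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def assumed_compatible(tested, running):
    """True when 'running' is the tested version or a later release of the same major."""
    tested_v, running_v = parse_version(tested), parse_version(running)
    return running_v[0] == tested_v[0] and running_v >= tested_v


print(assumed_compatible("8.1.0", "8.1.4"))  # True: later point release
print(assumed_compatible("8.1.0", "9.0.0"))  # False: new major version
print(assumed_compatible("8.1.0", "8.0.9"))  # False: older than what was tested
```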

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Dave Froble
2024-04-10 00:51:36 UTC
Permalink
Post by Simon Clubley
Post by Arne Vajhøj
But I think it would be very problematic with VMS complaining
over configs that are not known to work.
Because removing that test would require a release.
...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
..
No thanks.
It does not have to be a release - it could be a patch. It is also
absolutely no different from in the past when a new version of VMS
used to add support for new CPUs from DEC.
IOW, my suggested approach is a very long-established part of the
VMS world. The only difference now is that VMS would be allowed to
continue booting if you set an override flag or SYSGEN parameter.
Also, there should be no need to add support for "VM Bar 4" unless
it brought new functionality over "VM Bar 3" that you wanted to
support in VMS.
VMS is used in mission-critical production environments. You should
not be allowed to accidentally boot into an unsupported configuration
without being made _VERY_ aware of that fact.
Simon.
But hasn't the discussion been about the CL stuff? I don't think CL and mission
critical co-exist. I'm sure VSI doesn't think that.

As for due diligence, when did that go away? Any reasonable customer would
check, and re-check, that they are using supported stuff.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2024-04-10 12:12:32 UTC
Permalink
Post by Dave Froble
But hasn't the discussion been about the CL stuff? I don't think CL and mission
critical co-exist. I'm sure VSI doesn't think that.
No. This is about adding checks to VMS itself.
Post by Dave Froble
As for due diligence, when did that go away? Any reasonable customer would
check, and re-check, that they are using supported stuff.
People make mistakes. See my reply to Arne.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Dave Froble
2024-04-10 14:17:15 UTC
Permalink
Post by Simon Clubley
Post by Dave Froble
But hasn't the discussion been about the CL stuff? I don't think CL and mission
critical co-exist. I'm sure VSI doesn't think that.
No. This is about adding checks to VMS itself.
Post by Dave Froble
As for due diligence, when did that go away? Any reasonable customer would
check, and re-check, that they are using supported stuff.
People make mistakes. See my reply to Arne.
And they pay for them, and hopefully learn from them.

You cannot protect from everything.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
motk
2024-04-10 02:28:32 UTC
Permalink
We would see:
...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
Just looking at the supported hardware/virtualization environments on
https://vmssoftware.com/about/v92/ and noting a couple of things:

There's no actual 'supported' list, only 'tested'.
The latest tested QEMU version is 5.2.0, from 2020.
The list of 'tested' virt environments is pretty baroque and I wonder
how it's tested.
No thanks.
In practical terms, you're already there. That's the reality of COTS
virtualization. You can probably make a good case for requiring proper
ECC RAM and so on so long as you don't handwave away bug reports.
Arne
--
motk
Simon Clubley
2024-04-10 12:21:47 UTC
Permalink
Post by motk
We would see:
...
VMS 9.2-2H41 - added support for VM Foo 17 and VM Bar 3
VMS 9.2-2H42 - added support for VM Bar 4 and VM FooBar 7
Just looking at the supported hardware/virtualization environments on
There's no actual 'supported' list, only 'tested'.
The latest tested QEMU version is 5.2.0, from 2020.
The list of 'tested' virt environments is pretty baroque and I wonder
how it's tested.
Interesting. What is worrying from that page is that a recent version
of some VMware products causes VMS to fail during boot. I would really
like to know why that is as this is not something I would have expected
to see.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
motk
2024-04-09 05:31:36 UTC
Permalink
Post by Robert A. Brooks
We are not claiming support for Proxmox, although that testing has begun.
Given that it is a KVM-based hypervisor, getting it fully supported should not
be difficult, but we're not there yet.
It's basically vanilla kvm-qemu. People aren't trying to run it in
nested emulators on an FPGA or anything.
--
motk
Simon Clubley
2024-04-09 12:47:37 UTC
Permalink
Post by motk
Post by Robert A. Brooks
We are not claiming support for Proxmox, although that testing has begun.
Given that it is a KVM-based hypervisor, getting it fully supported should not
be difficult, but we're not there yet.
It's basically vanilla kvm-qemu. People aren't trying to run it in
nested emulators on an FPGA or anything.
That's what makes the dramatic failure you are seeing even more surprising.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Hans Bachner
2024-04-11 10:39:49 UTC
Permalink
Post by Robert A. Brooks
Post by Simon Clubley
If there's something VMS needs or a configuration it doesn't support,
then that should be probed at boot time and VMS should refuse to continue
booting with the reason why being made clear. The bug in this case is that
this check is missing from VMS.
That's one way to look at it.
Another way is that we have been quite clear what the requirements are to run VMS.
Any variation from that is unsupported. We recognize that there are likely
configurations that are technically unsupported, but will still likely work.
Preventing those configurations from working is something we could do, but
chose not to.
Post by Simon Clubley
The other possibility is that VMS is _supposed_ to work OK in this
configuration, but this specific VM setup has been untested by VSI until
now. That means there is a bug in the VMS code itself which needs fixing.
We are not claiming support for Proxmox, although that testing has begun.
Given that it is a KVM-based hypervisor, getting it fully supported should not
be difficult, but we're not there yet.
It is for these reasons that we've been quite conservative about what is supported.
We are interested in any feedback we get, but that doesn't mean we're
going to respond to every problem immediately when it's an unsupported
configuration.
At the Connect IT Symposium, Thilo Lauer (VSI) yesterday gave an
introduction to Proxmox and demoed an OpenVMS instance running under
Proxmox. Camiel Vanderhoeven also mentioned that official support for
Proxmox as a hypervisor is relatively high on their list, especially due
to the changes at VMware since the takeover by Broadcom.

So - stay tuned...

Hans.

PS: Camiel has been appointed "Chief Architect and Strategist" at VSI
recently. In his new role, he will relieve Clair Grant of some of his
responsibilities.
Arne Vajhøj
2024-04-11 18:57:13 UTC
Permalink
Post by Hans Bachner
PS: Camiel has been appointed "Chief Architect and Strategist" at VSI
recently. In his new role, he will relieve Clair Grant of some of his
responsibilities.
So he will be the key person for the discussion about what
happens to VMS *after* the x86-64 migration.

Arne
Dan Cross
2024-04-12 02:28:14 UTC
Permalink
Post by Hans Bachner
Post by Robert A. Brooks
Post by Simon Clubley
If there's something VMS needs or a configuration it doesn't support,
then that should be probed at boot time and VMS should refuse to continue
booting with the reason why being made clear. The bug in this case is that
this check is missing from VMS.
That's one way to look at it.
Another way is that we have been quite clear what the requirements are to run VMS.
Any variation from that is unsupported. We recognize that there are likely
configurations that are technically unsupported, but will still likely work.
Preventing those configurations from working is something we could do, but
chose not to.
Post by Simon Clubley
The other possibility is that VMS is _supposed_ to work OK in this
configuration, but this specific VM setup has been untested by VSI until
now. That means there is a bug in the VMS code itself which needs fixing.
We are not claiming support for Proxmox, although that testing has begun.
Given that it is a KVM-based hypervisor, getting it fully supported should not
be difficult, but we're not there yet.
It is for these reasons that we've been quite conservative about what is supported.
We are interested in any feedback we get, but that doesn't mean we're
going to respond to every problem immediately when it's an unsupported
configuration.
At the Connect IT Symposium, Thilo Lauer (VSI) yesterday gave an
introduction to Proxmox and demoed an OpenVMS instance running under
Proxmox. Camiel Vanderhoeven also mentioned that official support for
Proxmox as a hypervisor is relatively high on their list, especially due
to the changes at VMware since the takeover by Broadcom.
So - stay tuned...
Hans.
PS: Camiel has been appointed "Chief Architect and Strategist" at VSI
recently. In his new role, he will relieve Clair Grant of some of his
responsibilities.
Interesting. Personally, I'd really like to see OpenVMS get
official support on Oxide hardware. https://oxide.computer/

- Dan C.
Simon Clubley
2024-04-08 12:27:53 UTC
Permalink
Post by motk
Everything is going great.
$ help analyze
Improperly handled condition, bad stack or no handler specified.
Signal arguments: Number = 0000000000000005
Name = 000000000000000C
0000000000000004
00000000000007D8
FFFF830007C02367
0000000000000012
RAX = 000000007FF9DC80 RDI = 000000007FF9DC80 RSI = 00000000000007D8
RDX = 0000000000000000 RCX = 00000000000007D8 R8 = 00000000FFFF8F84
R9 = 000000000808080D RBX = 000000007FFABE00 RBP = 000000007FF9E4A0
R10 = 000000007FFABDB0 R11 = 000000007FFA4D18 R12 = 000000007FF9C648
R13 = 0000000000000018 R14 = 000000007FF9C800 R15 = 0000000000008301
RIP = FFFF830007C02367 RSP = 000000007FF9E440 SS = 000000000000001B
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual
address=000000000000000C, PC=0000000000000002, PS=7AD44D2F
Executive mode again (_if_ I am reading the PS register correctly and if
the current mode bits are in the same place as on Alpha.)
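For anyone wanting to repeat that reading of the PS value, here is a minimal Python sketch. It assumes the Alpha layout, where the current mode field sits in bits <4:3> of the PS; whether x86-64 VMS reports the PS with the same layout is exactly the open question above, so treat the result as a guess under that assumption. The function name is made up for illustration.

```python
# Sketch only: decode the "current mode" field of a PS value, ASSUMING the
# Alpha layout (current mode in bits <4:3>). The x86-64 layout may differ.

MODES = ("kernel", "executive", "supervisor", "user")


def current_mode(ps: int) -> str:
    """Return the access mode name held in bits <4:3> of ps."""
    return MODES[(ps >> 3) & 0x3]


# PS value taken from the ACCVIO message quoted above.
print(current_mode(0x7AD44D2F))  # prints "executive"
```

The low byte of the PS is 2F (binary 0010 1111), so bits <4:3> are 01, which is executive mode under the Alpha encoding.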

I wonder if RMS (or the XQP) has managed to corrupt your disk somehow.

Can you make the system disk available to a second instance and run an
$ anal/disk on it from that second instance ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
motk
2024-04-09 00:23:28 UTC
Permalink
On 4/8/24 22:27, Simon Clubley wrote:

[ ... ]

Post by Simon Clubley
Executive mode again (_if_ I am reading the PS register correctly and if
the current mode bits are in the same place as on Alpha.)
I wonder if RMS (or the XQP) has managed to corrupt your disk somehow.
Can you make the system disk available to a second instance and run an
$ anal/disk on it from that second instance ?
Sounds like a plan, I'll give it a go.
--
motk
motk
2024-04-06 00:36:42 UTC
Permalink
Post by Simon Clubley
The exit status on the accounting log may be interesting.
Try a "$ acc/since=<whatever>/full" and once you have found the correct
session, have a look at the exit status.
This seems relevant:

Queue entry: Final status code: 1000000C
Queue name:
Job name:
Final status text: %SYSTEM-F-ACCVIO, access violation, reason mask=!XB,
virtual address=!XH, PC=!XH
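As an aside, the final status code above can be picked apart by hand. The following Python sketch splits a 32-bit VMS condition value into its standard fields (severity in bits 0-2, message number in bits 3-15, facility in bits 16-27, control bits in 28-31, with bit 28 being "inhibit message"). The function and dictionary names are made up for illustration; only the field layout comes from the VMS condition value format.

```python
# Sketch only: split a 32-bit VMS condition value into its fields.
# Helper names here are illustrative, not part of any VMS API.

SEVERITY = {0: "warning", 1: "success", 2: "error", 3: "informational", 4: "fatal"}


def decode_status(status: int) -> dict:
    """Return the fields of a VMS condition value as a dictionary."""
    return {
        "severity": SEVERITY.get(status & 0x7, "reserved"),
        "message_number": (status >> 3) & 0x1FFF,
        "facility": (status >> 16) & 0xFFF,
        "inhibit_message": bool(status & 0x10000000),
        # Condition value with the control bits stripped off.
        "condition": status & 0x0FFFFFFF,
    }


# Final status code from the accounting record above.
print(decode_status(0x1000000C))
```

With the control bits removed, 1000000C becomes 0000000C, which is SS$_ACCVIO: facility 0 (SYSTEM), fatal severity. That matches the status text the accounting utility printed.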

I remembered vaguely how to drive sysgen, and have enabled SYSTEM_CHECK
now.
Post by Simon Clubley
Simon.
--
motk
Simon Clubley
2024-04-08 12:11:12 UTC
Permalink
Post by motk
Post by Simon Clubley
The exit status on the accounting log may be interesting.
Try a "$ acc/since=<whatever>/full" and once you have found the correct
session, have a look at the exit status.
Queue entry: Final status code: 1000000C
Final status text: %SYSTEM-F-ACCVIO, access violation, reason mask=!XB,
virtual address=!XH, PC=!XH
I remembered vaguely how to drive sysgen, and have enabled SYSTEM_CHECK
now.
Is there anything in the error log ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
motk
2024-04-09 00:32:54 UTC
Permalink
Post by Simon Clubley
Post by motk
I remembered vaguely how to drive sysgen, and have enabled SYSTEM_CHECK
now.
Is there anything in the error log ?
I was trying to check that when it carked it last time :)
--
motk
Tony Nicholson
2024-04-02 16:40:15 UTC
Permalink
Post by Jouk Jansen
I did not get that E-mail (should have got it because I got CLP licenses).
Seems no vmdk for me.
Go to https://vmssoftware.com/products/licenses/ where you'll need to
reapply for the Community license for VSI OpenVMS x86_64.

I did this last Sunday and received an e-mail with download details for
the vmdk disk image on Tuesday.

Tony
John H. Reinhardt
2024-04-03 14:16:09 UTC
Permalink
Post by Jouk Jansen
I did not get that E-mail (should have got it because I got CLP licenses).
Seems no vmdk for me.
Go to https://vmssoftware.com/products/licenses/ where you'll need to reapply for the Community license for VSI OpenVMS x86_64.
I did this last Sunday and received an e-mail with download details for
the vmdk disk image on Tuesday.
Tony
From Mister Moderator on the VSI Forum:

"This week already we have processed over 1000 applications to get access to the community instance of x86 OpenVMS for the VMDK. This included some old applications as well so please check your inboxes and spam folders. If you have not received one yet and have only just applied then please be patient as we will get to you and thankfully the process is a lot faster than it once was."

Hopefully everyone is getting their email.
--
John H. Reinhardt