Discussion:
DECnet Phase IV broken after VSI update
Rich Jordan
2021-10-29 17:17:30 UTC
Asking here because the mechanics of asking VSI are not in place yet.

We're doing testing for updating a customer system from an HP VMS V8.4 on RX3600 box to VSI V8.4-2L1 on an RX2800; same setup we were asking about disk partitioning before. Starting with restoring clean image backups of the production box, we installed the upgrade, along with updating DECnet Phase IV, TCPIP, and the other relevant item on the installation media. DECnet was working with the HP VMS version.

Since then we can't get DECnet to work. TCPIP seems to be ok but the DECnet circuit remains on-synchronizing. We have wiped the DECnet config and rebuilt it, same results. It sees all four NICs on the RX2800, and we have tried making each one in sequence the active connection, with the same results. We have confirmed nothing broke on the network by restoring the HP VMS version to another disk; works fine, and using a fresh install of VSI VMS on another disk; both have working DECnet.

This is our first VSI update (as opposed to install). I have a vague recollection of some kind of DECnet problem in early VSI days but have not found the reference yet. Anyone have info or a link?

Due to unavailability of some of the older software we really would prefer to stick with an upgrade over a fresh install and then try to get all their third party products working. DECnet is strongly desired for the test and eventual move from the 3600 to the 2800; after that it won't matter.

thanks
Dave Froble
2021-10-29 18:14:44 UTC
Post by Rich Jordan
Asking here because the mechanics of asking VSI are not in place yet.
We're doing testing for updating a customer system from an HP VMS V8.4 on RX3600 box to VSI V8.4-2L1 on an RX2800; same setup we were asking about disk partitioning before. Starting with restoring clean image backups of the production box, we installed the upgrade, along with updating DECnet Phase IV, TCPIP, and the other relevant item on the installation media. DECnet was working with the HP VMS version.
Since then we can't get DECnet to work. TCPIP seems to be ok but the DECnet circuit remains on-synchronizing. We have wiped the DECnet config and rebuilt it, same results. It sees all four NICs on the RX2800, and we have tried making each one in sequence the active connection, with the same results. We have confirmed nothing broke on the network by restoring the HP VMS version to another disk; works fine, and using a fresh install of VSI VMS on another disk; both have working DECnet.
This is our first VSI update (as opposed to install). I have a vague recollection of some kind of DECnet problem in early VSI days but have not found the reference yet. Anyone have info or a link?
Due to unavailability of some of the older software we really would prefer to stick with an upgrade over a fresh install and then try to get all their third party products working. DECnet is strongly desired for the test and eventual move from the 3600 to the 2800; after that it won't matter.
thanks
Recently I installed VSI VMS V8.4 2L3 on an RX2660 system.
DECnet is not working.
Message from user DECNET on ITANIC
DECnet event 4.7, circuit down, circuit fault
From node 1.37 (ITANIC), 29-OCT-2021 13:52:32.73
Circuit EWA-1, Line synchronization lost
Ok, from a TelNet session (sorry Steve, I like TelNet for local sessions)
I decided to run STARTNET.COM, and DECnet came up and for the first time
worked on this system.
Then I decided to reboot and see what happens. As soon as I got the iLO
into console mode, before logging in, DECnet died.
I haven't yet contacted VSI about the DECnet problem because I was a
bit busy doing stuff that didn't really need DECnet, though it
would have been nice to have it for VMS to VMS work. But now it
has me interested. Why would getting the iLO into console mode
cause a DECnet problem, if that is actually what happened?
Ok, support call logged with VSI. DECnet person not available until Monday.
I'll post what is discovered.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Rich Jordan
2021-10-29 18:59:35 UTC
Rich,
so you're saying: DECnet IV works after a fresh install of VSI VMS V8.4-2L1, but doesn't after an upgrade from HP VMS V8.4 ?
Any DECnet OPCOM messages ?
$ MC NCP SHOW KNO LINE COUNT
$ MC NCP SHOW KNO CIRC COUNT
DECnet license active ?
TCPIP started AFTER DECnet ?
Volker.
Volker
License is active as part of VMS FOE pak; DECnet is in end-node state.
TCPIP was not started during this boot; it was started manually during a previous boot as a test, and it comes after DECnet startup in the startup procedures.
I DO need to test it again to make sure I'm not reporting a false positive about its status but I'm swamped, later today or next week to confirm that.

The line count is all clear except

Line = EIA-0
...
4103 Send failure, including:
Carrier check failed


The circuit count is clear except

Circuit = EIA-0
31 Circuit Down

Note that I had the circuit turned off to stop all the operator messages, and it restarted the count when I re-enabled it to post this. It's up to 55 now.

I tried a different switch and cable also, no difference, and again both old and new switch worked fine when booted from old HP VMS or the fresh install of VSI VMS.


I'll follow up with a TCPIP startup and test as soon as I can; everyone decided to have their peecees do diaper dumps today.

I am not currently connected to the iLO but the iLO _is_ connected to the same switch as the server data link (only one server NIC has been connected at a time).
cao...@pitbulluk.org
2021-10-29 21:53:32 UTC
Rich,
so you're saying: DECnet IV works after a fresh install of VSI VMS V8.4-2L1, but doesn't after an upgrade from HP VMS V8.4 ?
Any DECnet OPCOM messages ?
$ MC NCP SHOW KNO LINE COUNT
$ MC NCP SHOW KNO CIRC COUNT
DECnet license active ?
TCPIP started AFTER DECnet ?
Volker.
Volker
License is active as part of VMS FOE pak; DECnet is in end-node state.
TCPIP was not started during this boot; it was started manually during a previous boot as a test, and it comes after DECnet startup in the startup procedures.
I DO need to test it again to make sure I'm not reporting a false positive about its status but I'm swamped, later today or next week to confirm that.
The line count is all clear except
Line = EIA-0
...
Carrier check failed
The circuit count is clear except
Circuit = EIA-0
31 Circuit Down
Note that I had the circuit turned off to stop all the operator messages, and it restarted the count when I re-enabled it to post this. It's up to 55 now.
I tried a different switch and cable also, no difference, and again both old and new switch worked fine when booted from old HP VMS or the fresh install of VSI VMS.
I'll follow up with a TCPIP startup and test as soon as I can; everyone decided to have their peecees do diaper dumps today.
I am not currently connected to the iLO but the iLO _is_ connected to the same switch as the server data link (only one server NIC has been connected at a time).
I've recently upgraded to VSI V8.4-2L3 on an RX4640 from L2 without any DECnet IV issues over ethernet circuits, except for issues with LAT. I solved that problem
by using separate interfaces for DECnet, TCP-IP and LAT.

Keith
Volker Halle
2021-10-30 07:06:56 UTC
Rich,
Post by Rich Jordan
Line = EIA-0
...
Carrier check failed
This does explain the 'on-synchronizing' state of the DECnet circuit. But WHY can't the interface send any packets ?

Let me repeat my question: a new installation of VSI OpenVMS V8.4-2L1 does work ? Please confirm.

Is SYS$EIDRIVER the SAME version/link date in both cases ?

Does LANCP> SHOW DEV/COUNT/INTERNAL EIA0 provide any additional information ? Please consider comparing the output of this command for the working and failing case.
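
A hedged aside on the comparison suggested above: if the command output from the working and the failing boot is saved to two text files and copied to any machine with Python, a small script can show only the lines that differ. This is a minimal sketch, not a VSI tool; the function name and file names are illustrative assumptions, and the sample lines in the usage note are made-up placeholders, not real LANCP output.

```python
import difflib
import sys


def diff_captures(good_lines, bad_lines, good_name="good.txt", bad_name="bad.txt"):
    """Return unified-diff lines for two saved command outputs.

    Trailing whitespace is stripped first so that padding differences
    do not show up as changes. Unchanged lines are omitted (n=0).
    """
    good = [line.rstrip() for line in good_lines]
    bad = [line.rstrip() for line in bad_lines]
    return list(
        difflib.unified_diff(
            good, bad, fromfile=good_name, tofile=bad_name, lineterm="", n=0
        )
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python capture_diff.py good.txt bad.txt
    with open(sys.argv[1]) as good_file, open(sys.argv[2]) as bad_file:
        for line in diff_captures(
            good_file.readlines(), bad_file.readlines(), sys.argv[1], sys.argv[2]
        ):
            print(line)
```

Called on two placeholder captures such as `["Link up", "Sent 100"]` and `["Link down", "Sent 100"]`, the function returns the diff header plus one removed and one added line for the link entry, and leaves out the identical counter line.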

How about SDA> SHOW LAN/DEV=EIA
and
SDA> LAN TRACE/DEVICE=EIA

I wouldn't expect TCPIP to work over this LAN interface, if the driver can't send any packets at all !

Volker.
Rich Jordan
2021-10-31 03:11:46 UTC
Post by Volker Halle
Rich,
Post by Rich Jordan
Line = EIA-0
...
Carrier check failed
This does explain the 'on-synchronizing' state of the DECnet circuit. But WHY can't the interface send any packets ?
Let me repeat my question: a new installation of VSI OpenVMS V8.4-2L1 does work ? Please confirm.
Is SYS$EIDRIVER the SAME version/link date in both cases ?
Does LANCP> SHOW DEV/COUNT/INTERNAL EIA0 provide any additional information ? Please consider comparing the output of this command for the working and failing case.
How about SDA> SHOW LAN/DEV=EIA
and
SDA> LAN TRACE/DEVICE=EIA
I wouldn't expect TCPIP to work over this LAN interface, if the driver can't send any packets at all !
Volker.
Volker
sorry been tied up on a new site network install that ate the last few days and will likely consume Sunday as well. I will check and followup Monday, don't have remote access to the RX2800 right now, but I can confirm that yes, DECnet did work on a fresh install of V8.4-2L1 from the ISO, was working with HP V8.4 from an image restore from the customer system, and does not work after upgrading HP to VSI using the same ISO. I should be able to confirm TCPIP function, reconfirming the DECnet function of the fresh install and getting the other requested info before getting sidetracked again. This project is becoming cursed with endless interruptions for other stuff.

Rich
Volker Halle
2021-10-31 11:50:20 UTC
Rich,

and $ SHOW DEV/FULL EIA0 should easily tell you, whether OpenVMS believes the link is up or not.

Volker.
Rich Jordan
2021-11-01 17:27:55 UTC
Rich,
and $ SHOW DEV/FULL EIA0 should easily tell you, whether OpenVMS believes the link is up or not.
Volker.
Testing this AM
I booted the fresh VSI install and started DECnet.
Decnet IV works. I can set host and copy files to the test Alphaserver on the network.
SYS$EIDRIVER reports Ident X-5, Link date 27-JUL-2016 10:30:17.55, Build Ident XE3A-H4N-000000

TCPIP also worked.


Then booted the upgraded system disk (HP V8.4 to VSI)

SYS$EIDRIVER is the same image version and link date (from 2016) on both system disks.

The LANCP internal counters show the same thing referred above; the upgrade system has a rising carrier check failure count, and zero packets received/sent, the VSI installed system shows normal data counters and no errors.

I did not boot the original (restored, pre-upgrade) HP V8.4 disk but it did work fine a couple of weeks ago, the first time since we used DECnet to make network backups over to the test alpha (safety copies) prior to upgrading; this after modifying the system to handle the different device names on the new server. I'll try overlaying the parameter file later if time and interrupts allow. Ditto for SDA output; I dumped the upgraded system's SHOW LAN/DEV to a file to review but my internals and crash dump analysis days ended 15+ years ago; we'll see what we can make out. The TRACE output mostly shows a series of Check Link entries with occasional Transmit Errors.

And sorry. I must have misremembered with all the testing and reboots. TCPIP does NOT work on the system that was upgraded from HP V8.4 to VSI V8.4-2L1. It has barely been used since DECnet is so much easier moving files back and forth from the test Alpha that has connections to the main network. The new RX2800 is on an isolated network with no direct outside access.

I threw the question out here after getting yanked off this work too many times and misquoted my notes. So both DECnet and TCPIP are down after the upgrade. Tried testing with LAT also after enabling outbound connections on both machines, and they can't see each other.

So it's not a DECnet specific issue.

Rich
Rich Jordan
2021-11-01 22:40:31 UTC
Rich,
thanks for testing and sharing your notes. So it really looks like an upgrade vs. install issue. And it affects the LAN interface, irrespective of the protocol used.
Is the 'link LED' lit on the physical LAN interface ?
SHOW DEV/FULL EIA0 shows 'no link' - right ?
Please test with the system parameter file from the freshly installed system disk.
Volker.
Volker
the sho dev command does show link down. The link light is lit on the port and the switch with the activity light winking once every 12-14 seconds.

I will try the parameter file overwrite tomorrow if at all possible. Got another priority interrupt to work on now.

Rich
Rich Jordan
2021-11-05 23:52:33 UTC
Rich,
thanks for testing and sharing your notes. So it really looks like an upgrade vs. install issue. And it affects the LAN interface, irrespective of the protocol used.
Is the 'link LED' lit on the physical LAN interface ?
SHOW DEV/FULL EIA0 shows 'no link' - right ?
Please test with the system parameter file from the freshly installed system disk.
Volker.
Replacing the parameter file on the upgraded system with the one from the clean VSI install did not make a difference WRT the network; it did cause some complaints because DECwindows could not start, but everything else did after I updated SCSNODE and SCSSYSTEMID in the copied file to match the upgraded system's name and ID.
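
For anyone repeating the parameter-file swap: on a node running DECnet Phase IV, SCSSYSTEMID is conventionally derived from the DECnet address as area * 1024 + node, and Phase IV sets the LAN station address to AA-00-04-00 followed by that 16-bit value, low byte first. The sketch below only works through that arithmetic; the function names are illustrative, and the address 1.37 is borrowed from the ITANIC event message earlier in this thread because the address of the RX2800 was not posted.

```python
def phase_iv_address(area, node):
    """Return the 16-bit DECnet Phase IV address for area.node.

    Phase IV allows areas 1-63 and node numbers 1-1023.
    """
    if not (1 <= area <= 63 and 1 <= node <= 1023):
        raise ValueError("area must be 1-63 and node must be 1-1023")
    return area * 1024 + node


def phase_iv_mac(area, node):
    """Return the Phase IV LAN station address as a hyphenated string."""
    address = phase_iv_address(area, node)
    # The 16-bit address is appended to AA-00-04-00 low byte first.
    return "AA-00-04-00-%02X-%02X" % (address & 0xFF, address >> 8)


if __name__ == "__main__":
    print(phase_iv_address(1, 37))  # 1061
    print(phase_iv_mac(1, 37))      # AA-00-04-00-25-04
```

So a node at 1.37 would use SCSSYSTEMID 1061, and two system disks booted with the same DECnet address on one LAN would present the same station address.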
I tried a second NIC port for the hell of it, but none of them come up. Counters show the same errors as before.
I'm contacting the customer's VSI rep to find out how we can put in a support request; hopefully that side of things has been taken care of.
Rich
I don't know if this is your problem. It was my problem.
By default, DECnet Phase IV installation and configuration will enable DECnet protocol on all available interfaces on the system. Once configured, the system administrator would want to go into NCP and purge all lines and circuits that are not needed from the database.
$ MCR NCP PURGE CIRCUIT EWA-1 ALL
$ MCR NCP PURGE LINE EWA-1 ALL
Then use STARTNET to start DECnet.
--
David Froble Tel: 724-529-0450
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
VSI found it, though there's still a mystery of sorts attached.

The original system also has EIA-0 (and 1) NICs. The EIA-0 is set to 1Gbs full duplex, auto negotiation disabled, and the corresponding Cisco port is set the same. This is because they've had the system long enough that they experienced the bad behavior between VMS and switches doing auto negotiations in the dim past.

When we restored the image backup to the RX2800s, it brought that config along, and the RX2800 also uses the EI devices. Since this is just offsite testing, we're using a dumb as a stump GbE switch for the uplink. While running HP VMS, apparently it did not matter; the connection came up at GbE speed. After the upgrade, there was a problem with it. VSI had me purge the device from LANCP (we had already done that from DECnet NCP per Volker's suggestion, without success) and reboot. When the device came up with default settings, the link came up and it worked.

Huzzah.

The mystery is that we had already tried using the other three NIC ports on this box, all of which had auto-negotiation enabled, on the VSI upgraded system, and none of them would come up, though I'm pretty sure they were all set to autonegotiate enabled. The second NIC on the production RX3600 is not in use but has autonegotiation enabled. All three of them had the same issue being unable to bring up the link.

So I'm not sure why this worked but it did. I may test the other NICs for the hell of it but we've been delayed too much by this so probably have to just keep working on the transition now.

Thanks for all the replies and suggestions.
Volker Halle
2021-11-06 07:34:39 UTC
Rich,

you might have been able to spot this yourself, if comparing the 'LAN Driver Messages' section (bottom couple of lines of LANCP> SHOW DEV/INTERNAL EIA0) between the good and the bad case.

Volker.
Rich Jordan
2021-11-07 00:31:17 UTC
Rich,
you might have been able to spot this yourself, if comparing the 'LAN Driver Messages' section (bottom couple of lines of LANCP> SHOW DEV/INTERNAL EIA0) between the good and the bad case.
Volker.
Possibly. Also if I'd had it connected to a decent switch. If I have a chance I'll set the port back to fixed speed with autonegotiate disabled and see if usable messaging shows up, but I doubt there will be time. Ditto temporarily plugging it into a Meraki switch port to see what it says.

But while the problem was occurring, we tried the other three NIC ports, all of which had auto negotiation enabled and no hard codes for speed or duplex, and all of them failed the same way as EIA-0 for DECnet, and showed link down on the device. On two different desktop GbE switches (which are toys, so might be an issue though they worked with HP VMS initially). I did not plug into one of the Cisco or Meraki switches, though, which might have provided usable diagnostics. It is entirely possible the desktop switches are nonconforming and perhaps the engineer Robert mentioned has fixed things to the point that those switches are no longer good enough. In production the server will be connected to Cisco Catalyst switches; I just don't have enough ports here to handle the segregated network that the server, a test alpha, and a monitor/terminal emulator PC is on except by using the desktop switch.
Robert A. Brooks
2021-11-06 13:52:03 UTC
Post by Rich Jordan
VSI found it, though there's still a mystery of sorts attached.
The original system also has EIA-0 (and 1) NICs. The EIA-0 is set to 1Gbs
full duplex, auto negotiation disabled, and the corresponding Cisco port is
set the same.
This is because they've had the system long enough that they
experienced the bad behavior between VMS and switches doing auto negotiations
in the dim past.
Unless they are referring back to the mid-90's when the early PCI Ethernet
adapters on Alphas were not-so-great, that info is a bit stale.

VMS Engineering (specifically, the guy who's been writing our Ethernet drivers
for over 30 years) has stated that auto-negotiate should always be used.

If it doesn't work, he'll fix it, or determine that the switch is non-conforming
to the standard.
--
-- Rob
Stephen Hoffman
2021-11-06 16:28:58 UTC
Post by Robert A. Brooks
Post by Rich Jordan
VSI found it, though there's still a mystery of sorts attached.
The original system also has EIA-0 (and 1) NICs. The EIA-0 is set to
1Gbs full duplex, auto negotiation disabled, and the corresponding
Cisco port is set the same.
This is because they've had the system long enough that they
experienced the bad behavior between VMS and switches doing auto
negotiations in the dim past.
Unless they are referring back to the mid-90's when the early PCI
Ethernet adapters on Alphas were not-so-great, that info is a bit stale.
VMS Engineering (specifically, the guy who's been writing our Ethernet
drivers for over 30 years) has stated that auto-negotiate should always
be used.
If it doesn't work, he'll fix it, or determine that the switch is
non-conforming to the standard.
That's been becoming the new policy since ~Y2K or so—though with some
wrinkles around Alpha and Itanium NICs—then GbE controllers and
late-era Fast Ethernet that are detected with auto-negotiate disabled
should generate an informational message at OpenVMS boot, in the logs,
when viewed within LANCP, and within the documentation. For important
network switch settings preferences, I'd be inclined to post driver
status information to end-users via SHOW DEVICE /FULL, and AMDS/AM, too.

The distribution of this information—and of other analogous
recommendations for many other API choices available—has been
inconsistent, at best. An API with choices needs to have published
opinions, and best has diagnostics when the existing settings are
drifting out of current preferences. Y'all want us pesky customers to
move in certain shorter-term or longer-term directions, y'all need to
tell us that. WTFM, minimally. Displaying diagnostics is preferred.

If y'all as developers don't have an opinion for an API or settings
choice, there shouldn't be an API or settings choice. And preferences
can shift over time, which means shifting our usages.

Unfortunately for this and similar cases where the end-user really
intends to have a bogus setting—this because there's a busted switch
port or busted switch firmware or whatever—OpenVMS also lacks a means
to provide overt alert messaging and to then suppress the overt
displays over time, moving the displays to status-related cases. Such
as into LANCP, here. That'll probably require some updates to the
existing 1970s- and 1980s-era diagnostics and status-reporting
infrastructure.
--
Pure Personal Opinion | HoffmanLabs LLC
Dave Froble
2021-11-06 19:32:21 UTC
Post by Rich Jordan
VSI found it, though there's still a mystery of sorts attached.
The original system also has EIA-0 (and 1) NICs. The EIA-0 is set to 1Gbs full duplex, auto negotiation disabled, and the corresponding Cisco port is set the same.
This is because they've had the system long enough that they experienced the bad behavior between VMS and switches doing auto negotiations in the dim past.
Unless they are referring back to the mid-90's when the early PCI Ethernet adapters on Alphas were not-so-great, that info is a bit stale.
VMS Engineering (specifically, the guy who's been writing our Ethernet drivers for over 30 years) has stated that auto-negotiate should always be used.
If it doesn't work, he'll fix it, or determine that the switch is non-conforming to the standard.
That's been becoming the new policy since ~Y2K or so—though with some wrinkles around Alpha and Itanium NICs—then GbE controllers and late-era Fast Ethernet that are detected with auto-negotiate disabled should generate an informational message at OpenVMS boot, in the logs, when viewed within LANCP, and within the documentation. For important network switch settings preferences, I'd be included to post driver status information to end-users via SHOW DEVICE /FULL, and AMDS/AM, too.
The distribution of this information—and of other analogous recommendations for many other API choices available—has been inconsistent, at best. An API with choices needs to have published opinions, and best has diagnostics when the existing settings are drifting out of current preferences. Y'all want us pesky customers to move in certain shorter-term or longer-term directions, y'all need to tell us that. WTFM, minimally. Displaying diagnostics is preferred.
If y'all as developers don't have an opinion for an API or settings choice, there shouldn't be an API or settings choice. And preferences can shift over time, which means shifting our usages.
Unfortunately for this and similar cases where the end-user really intends to have a bogus setting—this because there's a busted switch port or busted switch firmware or whatever—OpenVMS also lacks a means to provide overt alert messaging and to then suppress the the overt displays over time, moving the displays to status-related cases. Such as into LANCP, here. That'll probably require some updates to the existing 1970s- and 1980s-era diagnostics and status-reporting infrastructure.
I've got to second this concept. An example:

With one exception, every VMS system I set up had one ethernet port. The exception
is my AlphaServer 800, which had a 4 ethernet port card when I got it. After having
problems, I pulled out the 4 port card and installed a DE500-BA single port card.
Things worked, and I didn't look further.

One of your fine support people mentioned to me:

By default, DECnet Phase IV installation and configuration will enable DECnet protocol on all available interfaces on the system. Once configured, the system administrator would want to go into NCP and purge all lines and circuits that are not needed from the database.

I never knew that.

When setting up DECnet, perhaps in NETCONFIG, or elsewhere, something
could be mentioned about that issue.

Just one example of how to make VMS more user friendly.

And yes, I'm aware, the list of such "hints" could be quite extensive.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Lawrence D’Oliveiro
2021-11-07 01:52:22 UTC
} By default, DECnet Phase IV installation and configuration will enable DECnet protocol
} on all available interfaces on the system. Once configured, the system administrator
} would want to go into NCP and purge all lines and circuits that are not needed from
} the database.
I never knew that.
There is something to be said for the convention that a subsystem will, by default, *not* look at any hardware unless it has been explicitly configured to do so.
John Wallace
2021-11-07 09:59:36 UTC
Post by Dave Froble
Post by Stephen Hoffman
Post by Robert A. Brooks
Post by Rich Jordan
VSI found it, though there's still a mystery of sorts attached.
The original system also has EIA-0 (and 1) NICs.  The EIA-0 is set
to 1Gbs full duplex, auto negotiation disabled, and the
corresponding Cisco port is set the same.
This is because they've had the system long enough that they
experienced the bad behavior between VMS and switches doing auto
negotiations in the dim past.
Unless they are referring back to the mid-90's when the early PCI
Ethernet adapters on Alphas were not-so-great, that info is a bit stale.
VMS Engineering (specifically, the guy who's been writing our
Ethernet drivers for over 30 years) has stated that auto-negotiate
should always be used.
If it doesn't work, he'll fix it, or determine that the switch is
non-conforming to the standard.
That's been becoming the new policy since ~Y2K or so—though with some
wrinkles around Alpha and Itanium NICs—then GbE controllers and
late-era Fast Ethernet that are detected with auto-negotiate disabled
should generate an informational message at OpenVMS boot, in the logs,
when viewed within LANCP, and within the documentation. For important
network switch settings preferences, I'd be inclined to post driver
status information to end-users via SHOW DEVICE /FULL, and AMDS/AM, too.
The distribution of this information—and of other analogous
recommendations for many other API choices available—has been
inconsistent, at best. An API with choices needs to have published
opinions, and best has diagnostics when the existing settings are
drifting out of current preferences. Y'all want us pesky customers to
move in certain shorter-term or longer-term directions, y'all need to
tell us that. WTFM, minimally. Displaying diagnostics is preferred.
If y'all as developers don't have an opinion for an API or settings
choice, there shouldn't be an API or settings choice. And preferences
can shift over time, which means shifting our usages.
Unfortunately for this and similar cases where the end-user really
intends to have a bogus setting—this because there's a busted switch
port or busted switch firmware or whatever—OpenVMS also lacks a means
to provide overt alert messaging and to then suppress the overt
displays over time, moving the displays to status-related cases.  Such
as into LANCP, here. That'll probably require some updates to the
existing 1970s- and 1980s-era diagnostics and status-reporting
infrastructure.
With one exception, every VMS system I set up had one ethernet port.
The exception
is my AlphaServer 800, which had a 4 ethernet port card when I got it.
After having
problems, I pulled out the 4 port card and installed a DE500-BA single port card.
Things worked, and I didn't look further.
By default, DECnet Phase IV installation and configuration will enable
DECnet protocol on all available interfaces on the system.  Once
configured, the system administrator would want to go into NCP and purge
all lines and circuits that are not needed from the database.
I never knew that.
When setting up DECnet, perhaps in NETCONFIG, or elsewhere, something
could be mentioned about that issue.
Just one example of how to make VMS more user friendly.
And yes, I'm aware, the list of such "hints" could be quite extensive.
For some relatively brief period during the life of multi-port adapters
on PCI, and not just multiport network adapters, some PCI cards used PCI
to PCI bridges to provide multiple adapters on one card.

That introduced a whole load of fun for the affected adapters, as the
rules for configuring stuff behind a PCI bridge weren't particularly
clear at the time.

Pulling out your four port adapter and replacing it with a single port
adapter, in the AlphaServer 800 era, *might* have unknowingly fixed that
problem too.

Also, some systems used a PCI-PCI bridge on the *motherboard* to provide
an increased number of PCI slots. This is back in the days of e.g. Miata
and MiataGL and similar.

Back then, some considerable time ago, PCI-PCI bridges in general were a
bit of a challenge.

Nowadays the HYPErvisor presumably solves all this device support
weirdness, leaving just the DECnet bits to be sorted in your picture.
Simon Clubley
2021-11-06 20:42:14 UTC
Post by Robert A. Brooks
Unless they are referring back to the mid-90's when the early PCI Ethernet
adapters on Alphas were not-so-great, that info is a bit stale.
VMS Engineering (specifically, the guy who's been writing our Ethernet drivers
for over 30 years) has stated that auto-negotiate should always be used.
If it doesn't work, he'll fix it, or determine that the switch is non-conforming
to the standard.
That's a very interesting way of expressing that and leads to a more
interesting general question:

Does VMS support hardware which doesn't correctly implement a standard
(by implementing a workaround as Linux tends to do), or has VMS Engineering
over the decades outright said that it doesn't follow the standards,
so it's broken, so we won't support it ?

If it's the latter, is that going to change for x86-64 VMS, given some
of the hardware out there ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Bill Gunshannon
2021-11-06 20:46:45 UTC
Post by Simon Clubley
Post by Robert A. Brooks
Unless they are referring back to the mid-90's when the early PCI Ethernet
adapters on Alphas were not-so-great, that info is a bit stale.
VMS Engineering (specifically, the guy who's been writing our Ethernet drivers
for over 30 years) has stated that auto-negotiate should always be used.
If it doesn't work, he'll fix it, or determine that the switch is non-conforming
to the standard.
That's a very interesting way of expressing that and leads to a more
Does VMS support hardware which doesn't correctly implement a standard
(by implementing a workaround as Linux tends to do), or has VMS Engineering
over the decades outright said that it doesn't follow the standards,
so it's broken, so we won't support it ?
If it's the latter, is that going to change for x86-64 VMS, given some
of the hardware out there ?
I don't think that's an issue as it has been stated from the
beginning that VMS was not going to support everything but
only a certain subset of systems and components.

bill
Stephen Hoffman
2021-11-06 22:44:56 UTC
Post by Simon Clubley
Does VMS support hardware which doesn't correctly implement a standard
(by implementing a workaround as Linux tends to do), or has VMS
Engineering over the decades outright said that it doesn't follow the
standards, so it's broken, so we won't support it ?
DE500 Fast Ethernet support was... turbulent... during the hardware
transition to auto-negotiation, and was a mixed bag around
auto-negotiation.

As were some of the other NICs in that era. Later NICs worked better,
and GbE NICs do much better with both speed and duplex. Earlier NICs
can be hit or miss, and more than a few were settings-locked.

In the antediluvian era of networking, locking NIC settings was
recommended for specific configurations, too.
Post by Simon Clubley
If it's the latter, is that going to change for x86-64 VMS, given some
of the hardware out there ?
I would expect to be rid of DE500 NICs when transitioning over to
x86-64. Not at current GbE NIC prices and DE500 speeds, for those cases
where the GbE or 10 GbE NIC isn't already server-integrated.

There are *lots* of ways to get in trouble with OpenVMS. Where it'll
fail with weird errors. Duplicate MAC address detection checks were
added there, and can catch some configuration issues. There are others.

DHCP client support in OpenVMS was problematic for instance and may
still be (it's been about a decade since I've bothered to try it), and
hopefully that will be improving with what VSI has been and will be
working on.

IPv6 support on OpenVMS similarly needs some help.

For those of you wondering how to find and learn about the more
meddlesome areas? Postings around here, of course. There are other
ways. Over several years, there was a well-done and well-presented
series of boot camp technical sessions offered by some HP/HPE folks,
describing how to implement and debug and work around what was clearly
some poorly-documented and seemingly busted OpenVMS code. Better
documenting and hardening the associated OpenVMS code was seemingly
somehow out of bounds. Looking around for code that has
stupid-complicated configuration requirements and/or stupidly-manual
configuration requirements is another. SMTP server had a wonderful
failure mode for a while and may still, where a missing configuration
file generated no diagnostics and the SMTP server silently (and
incredibly) defaulted to operating as an open relay. I've previously
pointed to certificate authentication. IPv6. Etc.
--
Pure Personal Opinion | HoffmanLabs LLC
k***@gmail.com
2021-11-06 22:38:42 UTC
Reply
Permalink
Subject: [Info-vax] Working with broken hardware, was: Re: DECnet Phase IV
broken after VSI update
Post by Robert A. Brooks
Unless they are referring back to the mid-90's when the early PCI
Ethernet adapters on Alphas were not-so-great, that info is a bit stale.
VMS Engineering (specifically, the guy who's been writing our Ethernet
drivers for over 30 years) has stated that auto-negotiate should always
be used.
If it doesn't work, he'll fix it, or determine that the switch is
non-conforming to the standard.
That's a very interesting way of expressing that and leads to a more
Does VMS support hardware which doesn't correctly implement a standard
(by implementing a workaround as Linux tends to do), or has VMS Engineering
over the decades outright said that it doesn't follow the standards, so
it's broken, so we won't support it ?
If it's the latter, is that going to change for x86-64 VMS, given some of
the hardware out there ?
Simon.
--
FWIW .. it's still an industry problem (even today) - same issue on Linux and
Windows and Telecom gear in general, on platforms with old(er) switches and
gear that do not conform to today's standards.

Just google "auto-negotiation problems" ..

Sample -
<https://knowledge.broadcom.com/external/article/167467/why-am-i-getting-autonegotiation-problem.html>
"Auto-negotiation problems are common; they result from errors on the
Ethernet devices connected to the appliance, causing dropped packets,
reduced throughput, and session drops. Devices that are connected, such as
the router or a LAN switch could also switch from full-duplex to half-duplex
(and vice versa) because of auto-negotiation problems, resulting in poor
network performance.

In some cases, if a duplex mismatch occurs when the interface is
auto-negotiated and the connection is set to half-duplex, or the
auto-negotiation does not provide the optimal outcome. As a result, manually
setting the duplex setting might be the workaround to avoid this problem."

Since it is likely impossible for a host driver to be able to detect all
related issues (especially in X86 world), I would suggest the driver do a
small internal test and if it detects an issue, simply log the issue on the
console or host error log / event viewer.
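
To make the "small internal test" idea concrete, here is a rough Python sketch of the kind of check such a test could apply to interface counters. It relies on the usual symptom pattern of a duplex mismatch: the half-duplex side accumulates late collisions, while the full-duplex side accumulates CRC/FCS errors. The `LinkCounters` fields, the function name, and the 0.1% threshold are all assumptions made for illustration; they are not taken from any OpenVMS driver.

```python
from dataclasses import dataclass


@dataclass
class LinkCounters:
    """Hypothetical snapshot of per-interface counters."""
    duplex: str           # "full" or "half", as negotiated or forced locally
    frames: int           # total frames sent and received
    late_collisions: int  # collisions detected after the first 64 bytes
    crc_errors: int       # frames received with a bad FCS


def duplex_mismatch_suspected(c: LinkCounters, threshold: float = 0.001) -> bool:
    """Return True when the error ratio matches the classic mismatch pattern.

    The threshold is an arbitrary illustrative value, not a standard.
    """
    if c.frames == 0:
        return False
    if c.duplex == "half":
        # The half-duplex end sees the peer talking over it as late collisions.
        return c.late_collisions / c.frames > threshold
    # The full-duplex end sees truncated frames from the peer as CRC errors.
    return c.crc_errors / c.frames > threshold


if __name__ == "__main__":
    sample = LinkCounters(duplex="half", frames=100_000,
                          late_collisions=500, crc_errors=0)
    if duplex_mismatch_suspected(sample):
        print("possible duplex mismatch; check the switch port settings")
```

A check like this can only raise a suspicion for the log; it cannot tell which end of the link is misconfigured.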

Regards,

Kerry Main
Kerry dot main at starkgaming dot com
Dave Froble
2021-11-06 23:06:34 UTC
Reply
Permalink
Post by Simon Clubley
Post by Robert A. Brooks
Unless they are referring back to the mid-90's when the early PCI Ethernet
adapters on Alphas were not-so-great, that info is a bit stale.
VMS Engineering (specifically, the guy who's been writing our Ethernet drivers
for over 30 years) has stated that auto-negotiate should always be used.
If it doesn't work, he'll fix it, or determine that the switch is non-conforming
to the standard.
That's a very interesting way of expressing that and leads to a more
Does VMS support hardware which doesn't correctly implement a standard
(by implementing a workaround as Linux tends to do), or has VMS Engineering
over the decades outright said that it doesn't follow the standards,
so it's broken, so we won't support it ?
If it's the latter, is that going to change for x86-64 VMS, given some
of the hardware out there ?
Simon.
Which then brings up the question, just how many work-arounds do you want?
Perhaps until the code for work-arounds exceeds the code for the OS?

Since you asked, I'd suggest supporting conforming HW, and skip the rest.

If using a VM, the issue probably isn't. For "bare metal" (how did we ever
come up with such an idiot name?) as mentioned elsewhere, x86 VMS will
support a limited set of HW. Probably all new stuff, which most likely
doesn't include past kludges.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-11-08 01:16:00 UTC
Reply
Permalink
Post by Dave Froble
If using a VM, the issue probably isn't. For "bare metal" (how did we ever
come up with such an idiot name?) as mentioned elsewhere, x86 VMS will
support a limited set of HW.
It's not an idiotic name, but you _clearly_ don't have any embedded
experience. :-)

It's from the embedded world and means application code that runs
directly on the hardware without any operating system between the
code and the hardware.

In the same way, it was extended to mean operating systems that run
directly on the hardware instead of under some hypervisor.

If you ever see a reference to an operating system running on
"bare metal" while the operating system is in fact running under
a hypervisor and not directly on the hardware, then that's the
marketing people playing idiotic word games to try and make their
solution appear to be something that it is not. Such people can
(and should) be ignored.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Volker Halle
2021-11-01 05:50:39 UTC
Reply
Permalink
Rich,

a major difference between the upgrade and freshly installed system would be the system parameter file. Try copying SYS$SYSTEM:I64VMSSYS.PAR from the 'good' to the 'bad' system disk or vice versa.

Volker.
Phillip Helbig (undress to reply)
2021-10-30 06:27:32 UTC
Reply
Permalink
Ok, from a TelNet session (sorry Steve, I like TelNet for local sessions)
Why not LAT?
Craig A. Berry
2021-10-30 23:00:00 UTC
Reply
Permalink
Post by Phillip Helbig (undress to reply)
Ok, from a TelNet session (sorry Steve, I like TelNet for local sessions)
Why not LAT?
The terminal emulator I use, SmarTerm, does not have LAT as a standard
option.
Actually, I'd like to get the company all using something a bit more
secure.
I'm going to look at SSH.
When you do, you may need to have a look at the part of SYLOGIN.COM that
has the SET TERM/INQUIRE command. In the boilerplate version that comes
with VMS, a rather long list of device types are excluded from running
that command, including "FT" which is what is used by interactive SSH
connections, so by default, VMS won't know what kind of terminal you
have and lots of stuff doesn't work with terminal type "Unknown." If
you delete FT from the list, your interactive sessions will start
getting the correct terminal type, but other things that tunnel over SSH
may break.

In my case Kermit from a Linux system tunneled over SSH stopped working.
I had to exclude the account that was doing that from the check, which
is a pretty desperate hack. If there's a better way to identify
sessions that are using FTDRIVER and have a TT device with a device
class of 66 but are not actual interactive terminal sessions using SSH,
I'd be happy to hear what it is.
Dave Froble
2021-10-30 23:57:39 UTC
Reply
Permalink
Post by Craig A. Berry
Post by Phillip Helbig (undress to reply)
Ok, from a TelNet session (sorry Steve, I like TelNet for local sessions)
Why not LAT?
The terminal emulator I use, SmarTerm, does not have LAT as a standard option.
Actually, I'd like to get the company all using something a bit more secure.
I'm going to look at SSH.
When you do, you may need to have a look at the part of SYLOGIN.COM that
has the SET TERM/INQUIRE command. In the boilerplate version that comes
with VMS, a rather long list of device types are excluded from running
that command, including "FT" which is what is used by interactive SSH
connections, so by default, VMS won't know what kind of terminal you
have and lots of stuff doesn't work with terminal type "Unknown." If
you delete FT from the list, your interactive sessions will start
getting the correct terminal type, but other things that tunnel over SSH
may break.
In my case Kermit from a Linux system tunneled over SSH stopped working.
I had to exclude the account that was doing that from the check, which
is a pretty desperate hack. If there's a better way to identify
sessions that are using FTDRIVER and have a TT device with a device
class of 66 but are not actual interactive terminal sessions using SSH,
I'd be happy to hear what it is.
Thanks, may keep me from some cussing ...

If SYLOGIN is an issue, placing commands in select LOGIN.COM files could be one solution.
Not one I like. But nobody ever promised me a rose garden ...
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Stephen Hoffman
2021-10-31 01:22:27 UTC
Reply
Permalink
Post by Craig A. Berry
When you do, you may need to have a look at the part of SYLOGIN.COM
that has the SET TERM/INQUIRE command. In the boilerplate version that
comes with VMS, a rather long list of device types are excluded from
running that command, including "FT" which is what is used by
interactive SSH connections, so by default, VMS won't know what kind of
terminal you have and lots of stuff doesn't work with terminal type
"Unknown." If you delete FT from the list, your interactive sessions
will start getting the correct terminal type, but other things that
tunnel over SSH may break.
I'm guilty of most of that ginormous DCL abomination, as an earlier
wide-open SET TERMINAL /INQUIRE was blowing up too much, and the
terminal settings changes (particularly around performing that command
on the console, and around clearing screens, IIRC) were derailing the
regression tests. That DCL abomination because SET TERMINAL /INQUIRE
wasn't all that much past a box of rocks at its own intended job. That
all due to upward compatibility. And FT is (also) used for foreign
terminals, which'll blow up other stuff that uses FT.

Pragmatically, ssh should have used its own terminal type, but that's
probably not going to change now.

There are other issues lurking here, such as with the (lack of)
TT_ACCPORNAM handling allowing identifying the source of arriving IP
terminal connections.

It'd be interesting to see if ssh could be adapted to permit VT virtual
terminal support, too.

Dredging up an old and semi-related discussion around DECterm and its
use of FT, David Jones wrote "For FT devices, the field holding
internal PID of the job whose bytlm was charged for the UCB
(UCB$L_CPID) overlays UCB$L_LOCKID, so you can retrieve it using
sys$getdvi with the LOCKID item code. You can convert the internal PID
to an external PID and then sleuth the process that created the
pseudo-terminal (e.g. DECW$TE_xxxx would be a decterm)."

It took a while to get OpenVMS to detect the size of the original terminal
window in an OpenVMS session, but at least that's been working for a
while.

SET TERMINAL really shouldn't spew errors either, but the non-ANSI and
busted-emulation terminals will. Old Tektronix certainly did. (4014 was
pre-ANSI and did. 4125 had ANSI and didn't.) It's probably past time to
rework all of this OpenVMS code though, and to accept that ssh and
telnet (ugh) and OPA0: serial or iLO are our connection paths, and just
let truly fossil-grade terminals spew their errors, but... well...
compatibility.
--
Pure Personal Opinion | HoffmanLabs LLC
V***@SendSpamHere.ORG
2021-10-31 13:20:14 UTC
Reply
Permalink
Post by Stephen Hoffman
Post by Craig A. Berry
When you do, you may need to have a look at the part of SYLOGIN.COM
that has the SET TERM/INQUIRE command. In the boilerplate version that
comes with VMS, a rather long list of device types are excluded from
running that command, including "FT" which is what is used by
interactive SSH connections, so by default, VMS won't know what kind of
terminal you have and lots of stuff doesn't work with terminal type
"Unknown." If you delete FT from the list, your interactive sessions
will start getting the correct terminal type, but other things that
tunnel over SSH may break.
I'm guilty of most of that ginormous DCL abomination, as an earlier
wide-open SET TERMINAL /INQUIRE was blowing up too much, and the
terminal settings changes (particularly around performing that command
on the console, and around clearing screens, IIRC) were derailing the
regression tests. That DCL abomination because SET TERMINAL /INQUIRE
wasn't all that much past a box of rocks at its own intended job. That
all due to upward compatibility. And FT is (also) used for foreign
terminals, which'll blow up other stuff that uses FT.
Pragmatically, ssh should have used its own terminal type, but that's
probably not going to change now.
There are other issues lurking here, such as with the (lack of)
TT_ACCPORNAM handling allowing identifying the source of arriving IP
terminal connections.
$ sh term
Terminal: _FTA99: Device_Type: VT200_Series Owner: SYSTEM
Remote Port Info: ssh/ool-########.dyn.optonline.net:50130

Input: 9600 LFfill: 0 Width: 132 Parity: None
Output: 9600 CRfill: 0 Page: 24

Terminal Characteristics:
Interactive Echo Type_ahead No Escape
Hostsync TTsync Lowercase Tab
No Wrap Scope No Remote Eightbit
Broadcast No Readsync No Form Fulldup
No Modem No Local_echo No Autobaud No Hangup
No Brdcstmbx No DMA No Altypeahd Set_speed
No Commsync Line Editing Overstrike editing No Fallback
No Dialup Secure server No Disconnect No Pasthru
No Syspassword No SIXEL Graphics No Soft Characters No Printer Port
Numeric Keypad ANSI_CRT No Regis No Block_mode
Advanced_video Edit_mode DEC_CRT DEC_CRT2
No DEC_CRT3 No DEC_CRT4 No DEC_CRT5 No Ansi_Color
VMS Style Input <CTRL-H> Backspace

I developed a product over a decade ago to add the ACCPORNAM info to FTA's
used with ssh. I designed it to be flexible for site needs by providing a
set of "formatters" that can be specified to modify how the remote info is
presented in the ACCPORNAM field.
--
VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)ORG

I speak to machines with the voice of humanity.
Lawrence D’Oliveiro
2021-10-31 02:26:26 UTC
Reply
Permalink
Post by Craig A. Berry
In my case Kermit from a Linux system tunneled over SSH stopped working.
What sort of errors was it reporting?
Post by Craig A. Berry
I had to exclude the account that was doing that from the check, which
is a pretty desperate hack. If there's a better way to identify
sessions that are using FTDRIVER and have a TT device with a device
class of 66 but are not actual interactive terminal sessions using SSH,
I'd be happy to hear what it is.
What is the need for FTDRIVER, exactly? Isn’t there some standard pseudo-terminal facility equivalent to pty(7) <https://manpages.debian.org/bullseye/manpages/pty.7.en.html>? On Linux, SSH service is implemented entirely in userland.
Arne Vajhøj
2021-10-31 21:26:18 UTC
Reply
Permalink
But really, all actual physical dumb terminals are museum pieces now
(aren’t they?). All we have left are software-based terminal
emulators running on actual computers. And all these emulators
emulate some member of the VT100 family. So do we still need SET
TERMINAL/INQUIRE? Just assume you’re using a VT100, and be done with
it.
On VMS we will need at least VT200.

Arne
Arne Vajhøj
2021-10-31 21:52:33 UTC
Reply
Permalink
On Monday, November 1, 2021 at 10:26:25 AM UTC+13, Arne Vajhøj
Post by Arne Vajhøj
Just assume you’re using a VT100, and be done with it.
On VMS we will need at least VT200.
What extra capabilities do you need? All the ones that are worth
implementing should already be in the available open-source
emulators.
Emulators are fine. But we need VT200 emulation and terminal
type set to VT200 (or higher).

Arne
Lawrence D’Oliveiro
2021-10-31 21:27:41 UTC
Reply
Permalink
Post by Lawrence D’Oliveiro
Post by Craig A. Berry
In my case Kermit from a Linux system tunneled over SSH stopped working.
What sort of errors was it reporting?
None. It just hung. My best guess is the host was waiting for a reply
to the inquiry that it never got.
There are ways to debug that, but ... come to think of it, why are you bothering with Kermit when you have SSH? Because SSH includes secure file transfer capabilities via SCP and SFTP, so why not use those?
Simon Clubley
2021-11-01 18:43:49 UTC
Reply
Permalink
Post by Craig A. Berry
I had to exclude the account that was doing that from the check, which
is a pretty desperate hack. If there's a better way to identify
sessions that are using FTDRIVER and have a TT device with a device
class of 66 but are not actual interactive terminal sessions using SSH,
I'd be happy to hear what it is.
What is the need for FTDRIVER, exactly? Isn’t there some standard pseudo-terminal facility equivalent to pty(7) <https://manpages.debian.org/bullseye/manpages/pty.7.en.html>? On Linux, SSH service is implemented entirely in userland.
As already mentioned, it implements pseudoterminal support on VMS.

However, if you look at the man page you quote, you will see that the
actual pty devices are also created by a kernel mode device driver on
Linux, just as is done with FTDRIVER on VMS.

You will also see that there are even kernel parameters to control the
maximum number of pseudoterminals that exist at any one time on Linux.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Lawrence D’Oliveiro
2021-11-01 22:45:47 UTC
Reply
Permalink
Post by Simon Clubley
However, if you look at the man page you quote, you will see that the
actual pty devices are also created by a kernel mode device driver on
Linux ...
Of which there is only one, which is available to userland processes, and which is suitable for all cases where virtual terminals are needed. Implementing an SSH server? Use that. Implementing a GUI terminal emulator? Use that. Implementing a remote terminal server for some other network/comms protocol? Use that. Implementing a screen scraper to get info out of some legacy proprietary app? Use that.

Is FTDRIVER the same? If yes, then I take everything back, except to wonder why it doesn’t work better. :)
Arne Vajhøj
2021-11-01 23:05:48 UTC
Reply
Permalink
Post by Lawrence D’Oliveiro
Post by Simon Clubley
However, if you look at the man page you quote, you will see that
the actual pty devices are also created by a kernel mode device
driver on Linux ...
Of which there is only one, which is available to userland processes,
and which is suitable for all cases where virtual terminals are
needed. Implementing an SSH server? Use that. Implementing a GUI
terminal emulator? Use that. Implementing a remote terminal server
for some other network/comms protocol? Use that. Implementing a
screen scraper to get info out of some legacy proprietary app? Use
that.
Is FTDRIVER the same? If yes, then I take everything back, except to
wonder why it doesn’t work better. :)
FTDRIVER is the pseudo terminal driver. It is not specific to SSH or to
any particular application.

To quote the docs:

<quote>
Chapter 6. Pseudoterminal Driver

This chapter describes the use of the pseudoterminal driver (FTDRIVER)
and the pseudoterminal software.

A pseudoterminal is a software device that appears as a real terminal to
an application communicating with it, but does not require the existence
of a physical terminal. A pseudoterminal consists of two components: the
pseudoterminal device and a control program. The control program acts
like a keyboard; that is, anything written to the control program appears
on the pseudoterminal device as if the keystrokes had been typed in at a
physical terminal. The control program also acts like a viewport to the
pseudoterminal device; that is, the control program reads anything that
is written by the system to the pseudoterminal device.

A pseudoterminal allows an application to be set up on the control side
of the link to communicate with another application that is on the
pseudoterminal side. This arrangement allows development of applications
that either simulate users or monitor the communication between a real
user (at a physical terminal) and an application. As with other devices,
the work of the pseudoterminal is performed by a device driver and is
tightly coupled to the operating system.
</quote>

It may still be a bit different in nature from its Linux counterpart due
to the fact that VMS is different from Linux.

Arne
Lawrence D’Oliveiro
2021-11-03 23:55:24 UTC
Reply
Permalink
I’ve been having a look at this <https://vmssoftware.com/docs/VSI_IO_REF.pdf>. Seems this is a little bit different from a regular kernel driver: you need to use special PTD$xxx calls to open and close instances. Note this in the description of PTD$CREATE: “This channel is only intended to be used for PTD$XXX operations.” So there is some kind of extra layer on top of the kernel driver, and you are not supposed to access the latter directly.
Post by Arne Vajhøj
It may still be a bit different in nature from its Linux counterpart due
to the fact that VMS is different from Linux.
Still, VMS has -- or had -- a sufficiently versatile kernel driver architecture that it could have managed this better.
Arne Vajhøj
2021-11-04 00:19:10 UTC
Reply
Permalink
Post by Lawrence D’Oliveiro
I’ve been having a look at this
<https://vmssoftware.com/docs/VSI_IO_REF.pdf>. Seems this is a little
bit different from a regular kernel driver: you need to use special
PTD$xxx calls to open and close instances. Note this in the
description of PTD$CREATE: “This channel is only intended to be used
for PTD$XXX operations.” So there is some kind of extra layer on top
of the kernel driver, and you are not supposed to access the latter
directly.
Yes.

PTD$ user mode library ---> some VMS kernel code ---> driver code

Arne
Lawrence D’Oliveiro
2021-11-04 03:51:03 UTC
Reply
Permalink
Post by Arne Vajhøj
Post by Lawrence D’Oliveiro
I’ve been having a look at this
<https://vmssoftware.com/docs/VSI_IO_REF.pdf>. Seems this is a little
bit different from a regular kernel driver: you need to use special
PTD$xxx calls to open and close instances. Note this in the
description of PTD$CREATE: “This channel is only intended to be used
for PTD$XXX operations.” So there is some kind of extra layer on top
of the kernel driver, and you are not supposed to access the latter
directly.
Yes.
PTD$ user mode library ---> some VMS kernel code ---> driver code
And note the implication that bad things could happen if you $DASSGN the channel without calling PTD$DELETE. Or if you tried to $QIO yourself without using PTD$READ/WRITE.

I wonder why it was necessary to do it this way?
David Jones
2021-11-04 04:54:19 UTC
Reply
Permalink
Post by Lawrence D’Oliveiro
Post by Arne Vajhøj
... So there is some kind of extra layer on top
of the kernel driver, and you are not supposed to access the latter
directly.
Yes.
PTD$ user mode library ---> some VMS kernel code ---> driver code
And note the implication that bad things could happen if you $DASSGN the channel without calling PTD$DELETE. Or if you tried to $QIO yourself without using PTD$READ/WRITE.
I wonder why it was necessary to do it this way?
Most of PTD$ code runs in kernel mode with some at elevated IPL. The PTD$CREATE call
creates a buffer object for the buffer you supply so that the buffer (and its S0 address)
stays locked in memory. PTD$READ/WRITE queue request packets directly to FTDRIVER,
skipping some of the overhead of $QIO processing.

I wonder if DECnet would be able to use a pseudo-terminal for a DDCMP link and have
the control program tunnel the packets over a secure link of some kind. Unlike LDRIVER,
unfortunately, PTD$CREATE doesn't let you specify the unit number of the created
device.
Arne Vajhøj
2021-11-04 13:23:29 UTC
Reply
Permalink
Post by David Jones
Post by Lawrence D’Oliveiro
Post by Arne Vajhøj
... So there is some kind of extra layer on top
of the kernel driver, and you are not supposed to access the latter
directly.
Yes.
PTD$ user mode library ---> some VMS kernel code ---> driver code
And note the implication that bad things could happen if you
$DASSGN the channel without calling PTD$DELETE. Or if you tried to $QIO yourself
without using PTD$READ/WRITE.
I wonder why it was necessary to do it this way?
Most of PTD$ code runs in kernel mode with some at elevated IPL.
Very interesting.

So it is not a traditional call stack:

user exe      application       user mode
P0 space
----------    ----------
VMS shrexe    PTD$ API
P0 space      some logic
----------    ----------
VMS           SYS$ API
S0 space      CHMK              ----------
              more logic        kernel mode
              ----------
              EXE$ and other
              ----------
              Driver

But:

user exe      application       user mode
P0 space
----------    ----------
VMS shrexe    PTD$ API
P0 space      some logic
              SYS$CMKRNL        ----------
              more logic        kernel mode
----------    ----------
VMS           EXE$ and other
S0 space      ----------
              Driver

Sounds like somebody did not like SYS$QIO(W).

:-)
Post by David Jones
The PTD$CREATE call
creates a buffer object for the buffer you supply so that the buffer (and its S0 address)
stays locked in memory.
There is nothing unusual in the fact that when doing an "open" at some
level, one then needs to do a "close" at the same level.
Post by David Jones
PTD$READ/WRITE queue request packets directly to FTDRIVER,
skipping some of the overhead of $QIO processing.
Arne
David Jones
2021-11-04 14:14:02 UTC
Reply
Permalink
Post by Arne Vajhøj
Post by David Jones
Most of PTD$ code runs in kernel mode with some at elevated IPL.
Very interesting.
PTD$SERVICES_SHR.EXE is installed as a protected shareable image; on
entry, the functions it exports CHMK in order to process the request
in kernel mode via that image's change mode dispatcher. PTD$READW and
PTD$WRITEW invoke CHMK with the code for the corresponding asynchronous
function and then call EXE$SYNCH (the same way SYS$QIOW works).
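
As a loose analogy only, the "asynchronous call plus a wait" shape of the W variants can be sketched in Python. Nothing here is VMS code: `Request`, `read_async`, and `read_sync` are hypothetical names, the worker thread stands in for the driver completing the request, and the `threading.Event` stands in for whatever completion state the synchronous wait checks.

```python
import threading


class Request:
    """Hypothetical I/O request with a completion flag and a result slot."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None


def read_async(source, request):
    """Start the operation and return immediately.

    Completion is signalled later by setting request.done.
    """
    def worker():
        request.result = source()
        request.done.set()

    threading.Thread(target=worker, daemon=True).start()


def read_sync(source, timeout=5.0):
    """Synchronous variant: issue the asynchronous call, then wait for it."""
    request = Request()
    read_async(source, request)
    if not request.done.wait(timeout):
        raise TimeoutError("request did not complete")
    return request.result


if __name__ == "__main__":
    print(read_sync(lambda: b"data"))
```

The point of the shape is that the synchronous entry point adds no I/O logic of its own; it only adds the wait.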
Post by Arne Vajhøj
Sounds like somebody did not like SYS$QIO(W).
Considering how slow the VAX was and that terminal I/O was often many
many small transfers, that kind of micro-optimization was probably worth
while.
Lawrence D’Oliveiro
2021-11-05 01:01:32 UTC
Reply
Permalink
Post by David Jones
Post by Arne Vajhøj
Sounds like somebody did not like SYS$QIO(W).
Considering how slow the VAX was and that terminal I/O was often many
many small transfers, that kind of micro-optimization was probably worth
while.
Remember the old saying, variously attributed to Hoare or Knuth: “premature optimization is the root of all evil”. By making pseudo-terminals a special case, they lost the ability to treat them uniformly as part of a common event loop I/O framework.

Meanwhile, Unix systems had the pty(7) device. Open one, and you get just another file descriptor, and you access it with normal read(2) and write(2) calls, same as any other file descriptor. And you can use it with select/poll/epoll.
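
A minimal sketch of that pattern in Python, assuming a Linux or other Unix host: open a pseudoterminal pair, write to one side, wait with select, and read from the other side, all with ordinary file descriptor calls. The helper name `pty_round_trip` is hypothetical, and the slave side is put into raw mode so that echo and line buffering do not get in the way.

```python
import os
import select
import tty


def pty_round_trip(payload: bytes, timeout: float = 2.0) -> bytes:
    """Write payload to the master side of a pty and read it from the slave."""
    master_fd, slave_fd = os.openpty()
    try:
        # Raw mode: no echo, no line buffering, no CR/LF translation.
        tty.setraw(slave_fd)
        # Bytes written to the master arrive as terminal input on the slave.
        os.write(master_fd, payload)
        # The slave is an ordinary descriptor, so select() works on it.
        readable, _, _ = select.select([slave_fd], [], [], timeout)
        if slave_fd not in readable:
            raise TimeoutError("no data arrived on the slave side")
        return os.read(slave_fd, len(payload))
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(pty_round_trip(b"ping"))
```

The same descriptors could be handed to poll or epoll in a larger event loop alongside sockets and pipes.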
Lawrence D’Oliveiro
2021-11-05 00:54:35 UTC
Reply
Permalink
Post by David Jones
Most of PTD$ code runs in kernel mode with some at elevated IPL.
So does a device driver. Why do the PTD functions have to be a separate layer from the device driver? Particularly a separate, caller-visible layer that breaks the normal driver QIO abstraction?
Post by David Jones
PTD$READ/WRITE queue request packets directly to FTDRIVER,
skipping some of the overhead of $QIO processing.
Remember, these are terminals we are talking about -- devices geared to the limits of human I/O bandwidth. Any “overheads” associated with I/O processing would be insignificant compared to the time it takes for a human to type input or read output.
David Jones
2021-11-05 06:46:14 UTC
Reply
Permalink
Post by Lawrence D’Oliveiro
Post by David Jones
Most of PTD$ code runs in kernel mode with some at elevated IPL.
So does a device driver. Why do the PTD functions have to be a separate layer from the device driver? Particularly a separate, caller-visible layer that breaks the normal driver QIO abstraction?
Post by David Jones
PTD$READ/WRITE queue request packets directly to FTDRIVER,
skipping some of the overhead of $QIO processing.
Remember, these are terminals we are talking about -- devices geared to the limits of human I/O bandwidth. Any “overheads” associated with I/O processing would be insignificant compared to the time it takes for a human to type input or read output.
That's a PC mentality. VAXes were timesharing systems with many concurrent users. C was a second class
language in the VAX/VMS world, too, so the API wasn't designed to cater to the patterns of the C RTL. Threads
didn't exist; your server process handling many PTYs was state driven by ASTs delivered by the terminal
driver.
Craig A. Berry
2021-11-05 15:39:56 UTC
Reply
Permalink
Post by David Jones
Post by Lawrence D’Oliveiro
Post by David Jones
Most of PTD$ code runs in kernel mode with some at elevated IPL.
So does a device driver. Why do the PTD functions have to be a separate layer from the device driver? Particularly a separate, caller-visible layer that breaks the normal driver QIO abstraction?
Post by David Jones
PTD$READ/WRITE queue request packets directly to FTDRIVER,
skipping some of the overhead of $QIO processing.
Remember, these are terminals we are talking about -- devices
geared to the limits of human I/O bandwidth. Any “overheads” associated with
I/O processing would be insignificant compared to the time it takes for
a human to type input or read output.
That's a PC mentality. VAXes were timesharing systems with many
concurrent users. C was a second class language in the VAX/VMS world,
too, so the API wasn't designed to cater to the patterns of the C RTL.
Threads didn't exist, your server process handling many PTYs was
state driven by ASTs delivered by the terminal driver.
And I remember lots of complaints from people who had to slow down their
typing so the VAX could catch up.
Bill Gunshannon
2021-11-05 16:11:34 UTC
Reply
Permalink
Post by Craig A. Berry
Post by David Jones
Post by Lawrence D’Oliveiro
Post by David Jones
Most of PTD$ code runs in kernel mode with some at elevated IPL.
So does a device driver. Why do the PTD functions have to be a
separate layer from the device driver? Particularly a separate,
caller-visible layer that breaks the normal driver QIO abstraction?
Post by David Jones
PTD$READ/WRITE queue request packets directly to FTDRIVER,
skipping some of the overhead of $QIO processing.
Remember, these are terminals we are talking about -- devices
geared to the limits of human I/O bandwidth. Any “overheads” associated with
I/O processing would be insignificant compared to the time it takes for
a human to type input or read output.
That's a PC mentality; VAXes were timesharing systems with many
concurrent users. C was a second class language in the VAX/VMS world,
too, so the API wasn't designed to cater to the patterns of the C RTL.
Threads didn't exist; your server process handling many PTYs was
state driven by ASTs delivered by the terminal driver.
And I remember lots of complaints from people who had to slow down their
typing so the VAX could catch up.
The only time I ever saw that happen on a VAX was when I was using
EUNICE and someone else fired up the ADA Compiler. :-)

bill
Lawrence D’Oliveiro
2021-11-05 23:59:41 UTC
Post by Bill Gunshannon
The only time I ever saw that happen on a VAX was when I was using
EUNICE and someone else fired up the ADA Compiler. :-)
NYU Ada/Ed, I believe it was called? Written in SETL, and interpreted? And your programs were also compiled to SETL?

The one where, when your program hit an error, it took about 60 seconds† just to crash?

†I timed it.
Arne Vajhøj
2021-11-06 01:39:45 UTC
Post by Lawrence D’Oliveiro
Post by Bill Gunshannon
The only time I ever saw that happen on a VAX was when I was using
EUNICE and someone else fired up the ADA Compiler. :-)
NYU Ada/Ed, I believe it was called? Written in SETL, and interpreted? And your programs were also compiled to SETL?
The one where, when your program hit an error, it took about 60 seconds† just to crash?
†I timed it.
VAX Ada seems more likely.

Arne
Lawrence D’Oliveiro
2021-11-06 00:06:20 UTC
... when I was using EUNICE ...
Ah, anybody else remember those HSH0NAEDA.HSH files proliferating on users’ accounts? (Why do I still remember that name...) And how its fork(2) emulation had to be implemented via a special executable called FORKDUMY.EXE or something like that? (Not that I understood how it worked -- I never used it.)

And when you built Perl on any *nix system for years after, one of the messages that could come from the build config system was “Congratulations, you’re not running EUNICE!”?
Craig A. Berry
2021-11-06 01:31:29 UTC
Post by Lawrence D’Oliveiro
And when you built Perl on any *nix system for years after, one of
the messages that could come from the build config system was
“Congratulations, you’re not running EUNICE!”?
Still says it today:

<https://github.com/Perl/perl5/blob/4a1b9dd524007193213d3919d6a331109608b90c/Configure#L4519>

We should probably declutter a bit and rip out anything related to
eunice in the Perl sources.
Lawrence D’Oliveiro
2021-11-05 23:57:23 UTC
Post by Craig A. Berry
And I remember lots of complaints from people who had to slow down their
typing so the VAX could catch up.
What was your VAX doing? I’m a fast enough typist that I was able to notice the very slight difference in echo time on a VT100 between a LAT connection and a direct serial line. (This was before I gave up on real terminals and switched to NCSA Telnet on a Mac.)
Craig A. Berry
2021-11-06 01:37:09 UTC
Post by Lawrence D’Oliveiro
Post by Craig A. Berry
And I remember lots of complaints from people who had to slow down their
typing so the VAX could catch up.
What was your VAX doing?
Lots of circuit simulations, SPICE3 and various home-grown stuff. I
think increasing BYTLM quotas helped a little, but you couldn't increase
it too much on a 2MB VAX.
Stephen Hoffman
2021-10-31 19:18:12 UTC
There is also something called a virtual terminal (VTDRIVER?) but I've
never used it and couldn't find much in the way of documentation just
now.
"It'd be interesting to see if ssh could be adapted to permit VT
virtual terminal support, too."
VT Virtual Terminals are TTDRIVER, reconfigured and operating as
disconnectable.

https://vmssoftware.com/docs/VSI_SYS_MGMT_MANUAL_VOL_I.PDF#page=270

Also see the DISCONNECT and CONNECT commands in DCL.

Getting DECterm to deal with this would be substantially more of an
effort, with all of X along for that dark and scary and
really-just-an-RPC-with-an-attached-windowing-system ride.

Virtual Terminals are handy when connections are spotty, which used to
be modems and now might be mobile or congested or other spotty network
connections.

The closest client-side analog to the OpenVMS virtual terminal support
is arguably the GNU screen and BSD tmux apps, though these must
continue to be connected with the server and don't otherwise preserve
the server-side session.

For server management, this virtual terminal support was a way to
temporarily back out of a connection without losing context,
fix/restart/rewire/reboot whatever happened with the connection or with
client-side, and re-connect and resume the server session. Such as
happens when the modem drops out. Yeah, modems, how quaint. Which is
why virtual terminal support really hasn't seen much widespread use in
recent decades.

Getting ssh and the rest over to its own terminal device name and off
FT and/or BG would make the whole thing a shade easier to manage using
traditional tools, too. Though I do recall somebody around here has
grumbled once or twice about OpenVMS (dis)integration with IP. OpenVMS
development had its head... elsewhere and too-long-preferred DECnet and
OSI for way too many years, and the fallout from that era continues
to plague us.

Not that I expect any of this to change this decade, as VSI has other
and higher-priority work.
--
Pure Personal Opinion | HoffmanLabs LLC
Lawrence D’Oliveiro
2021-11-01 06:04:31 UTC
Post by Stephen Hoffman
Getting ssh and the rest over to its own terminal device name and off
FT and/or BG would make the whole thing a shade easier to manage using
traditional tools, too. Though I do recall somebody around here has
grumbled once or twice about OpenVMS (dis)integration with IP. OpenVMS
development had its head... elsewhere and too-long-preferred DECnet and
OSI for way too many years, and the fallout from that era continues
to plague us.
I remember that hoary old RSTS/E had pseudo-keyboards accessible from (privileged) userland processes, odd that VMS never did. Dave Cutler didn’t want to sully his brainchild with such an “inefficient” concept?

Also I remember looking into how SET HOST (the DECnet equivalent of Telnet) worked: once the RTPAD program had set up the connection, all subsequent I/O transfers between the terminal driver and the network connection happened directly in kernel mode, with little or no further intervention from userland. No doubt efficient (at least looking at user-mode CPU usage), but needing a custom kernel-mode driver for each terminal-connection protocol limited your flexibility somewhat.
Robert A. Brooks
2021-11-01 15:08:59 UTC
I'm pretty sure the DECnet Phase IV code is not broken.
🎯
Noted.
--
-- Rob
Dave Froble
2021-11-01 17:41:43 UTC
Recently I installed VSI VMS V8.4-2L3 on an RX2660 system.
DECnet is not working.
I just checked the sources for NETACP.EXE and NETDRIVER.EXE, and there was a single change between V8.4-2L1 and V8.4-2L3.
It was a change that I made that fixed a problem in searching the binary trees in the volatile database
for node entries.
That code is only used for searching the node entries themselves, since node databases can hold up to 63*1024 entries
and needed some form of optimization.
The other volatile databases (line, circuit, object, etc...) hold far fewer entries, and are not implemented as binary
trees.
I'm pretty sure the DECnet Phase IV code is not broken.
--
-- Rob
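As a rough illustration of the 63*1024 figure above: a Phase IV address packs a 6-bit area and a 10-bit node number into 16 bits, so a node database keyed by address can grow to tens of thousands of entries, which is where a tree lookup pays off over a linear scan. The Python sketch below is not the NETACP implementation; `phase_iv_address`, `NodeEntry`, `insert`, and `find` are hypothetical names, and a plain unbalanced binary search tree stands in for whatever structure the real code uses.

```python
# Illustrative sketch only; names are hypothetical, not from the NETACP sources.
MAX_AREA = 63          # Phase IV areas are numbered 1..63
NODES_PER_AREA = 1024  # 10-bit node field; valid node numbers are 1..1023


def phase_iv_address(area, node):
    """Pack an area.node pair into its 16-bit Phase IV address."""
    if not 1 <= area <= MAX_AREA:
        raise ValueError("area must be 1..63")
    if not 1 <= node < NODES_PER_AREA:
        raise ValueError("node must be 1..1023")
    return area * NODES_PER_AREA + node


class NodeEntry:
    """One node-database entry, stored as a binary search tree node."""

    def __init__(self, address, name):
        self.address = address
        self.name = name
        self.left = None
        self.right = None


def insert(root, address, name):
    """Insert or update an entry keyed by address; returns the subtree root."""
    if root is None:
        return NodeEntry(address, name)
    if address < root.address:
        root.left = insert(root.left, address, name)
    elif address > root.address:
        root.right = insert(root.right, address, name)
    else:
        root.name = name  # same address: overwrite the name
    return root


def find(root, address):
    """Walk the tree by address; returns the entry, or None if absent."""
    while root is not None and root.address != address:
        root = root.left if address < root.address else root.right
    return root
```

For example, `phase_iv_address(1, 1)` is 1025 and `phase_iv_address(63, 1023)` is 65535, the top of the 16-bit range. The smaller line, circuit, and object databases mentioned above would not need this kind of structure.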
I don't believe that DECnet is broken either. A clean installation works as expected. There is something in the upgrade process that breaks existing DECnet systems.
This is what was reported and is of concern.
Dan, that is what Rich reported, not what I reported. Mine was a ??clean??
install of 2L3. My problem has been solved, I think.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486