Discussion:
Are the LAN VCI 2.0 plans ambitious enough?
Dirk Munk
2017-01-06 14:26:26 UTC
Permalink
Raw Message
I’ve seen the plans for LAN VCI 2.0, but I wonder if they are ambitious
enough. I’m mostly referring to the hardware offload capabilities that
are planned now. These are the plans VSI published:

Support for hardware offload features:
Send checksum offload (Send CKO).
Receive Checksum Offload (Recv CKO).
TCP Segmentation Offload (TSO).
Large Receive Offload (LRO).

We all know encrypted networking is the future, and encryption is
getting stronger, hence you need more computing power to do
encryption/decryption. Modern high-end NICs have processors or
numerical co-processors to do these calculations, and you can get TLS
and IPsec offload.
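By way of illustration only, here is a minimal sketch in C of how a driver and stack might negotiate offload features at attach time; the structure and flag names are entirely hypothetical, not any VSI or vendor interface:

    /* Hypothetical capability bits - not a real VSI or vendor API. */
    #define NIC_CAP_TX_CKO  0x0001  /* send checksum offload        */
    #define NIC_CAP_RX_CKO  0x0002  /* receive checksum offload     */
    #define NIC_CAP_TSO     0x0004  /* TCP segmentation offload     */
    #define NIC_CAP_LRO     0x0008  /* large receive offload        */
    #define NIC_CAP_TLS     0x0010  /* TLS record crypto offload    */
    #define NIC_CAP_IPSEC   0x0020  /* IPsec ESP/AH crypto offload  */

    struct nic_caps {
        unsigned int supported;     /* what the hardware can do       */
        unsigned int enabled;       /* what the stack chose to enable */
    };

    /* Enable every requested feature the NIC supports, and return the
     * ones that are left for the stack to do in software on the CPU. */
    static unsigned int negotiate_offloads(struct nic_caps *caps,
                                           unsigned int wanted)
    {
        caps->enabled = caps->supported & wanted;
        return wanted & ~caps->enabled;
    }

The point is simply that if TLS and IPsec bits are never defined in the driver interface, no future NIC will ever be able to report them.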

That, however, is not in VSI's plans, and I think it should be. Of course
you also need to adapt your drivers for that kind of work, and since VSI
is working on its new IP stack, in my view it would be a good thing if
they took these offload requirements into consideration.

If this kind of offload is not on the NIC, the work will consume a lot of
CPU power and will most likely degrade network performance.

They are also thinking about the Precision Time Protocol (PTP), but for
that you need NICs that have a special circuit to intercept PTP packets
and send them straight to a special driver that will update the clock.
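For comparison only, since OpenVMS has no such interface today: on Linux a PTP daemon asks the driver for NIC hardware timestamps roughly like this (these are real Linux socket options, not an OpenVMS API, and the driver/NIC must support them, which is exactly the hardware dependency in question):

    #include <sys/socket.h>
    #include <linux/net_tstamp.h>   /* SOF_TIMESTAMPING_* flags */

    /* Request NIC hardware receive timestamps on this socket; a real
     * PTP daemon also configures the NIC itself (SIOCSHWTSTAMP ioctl)
     * and reads the timestamps back from ancillary (cmsg) data. */
    static int enable_hw_rx_timestamps(int sock)
    {
        int flags = SOF_TIMESTAMPING_RX_HARDWARE |
                    SOF_TIMESTAMPING_RAW_HARDWARE;
        return setsockopt(sock, SOL_SOCKET, SO_TIMESTAMPING,
                          &flags, sizeof(flags));
    }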

At the moment VSI is looking at two NICs, and neither of them can do TLS
or IPsec offload, or PTP.

So I think VSI should be a bit more ambitious with their LAN VCI 2.0 plans.
David Froble
2017-01-06 21:21:33 UTC
Permalink
Raw Message
Post by Dirk Munk
I’ve seen the plans for LAN VCI 2.0, but I wonder if they are ambitious
enough. I’m mostly referring to the hardware offload capabilities that
Send checksum offload (Send CKO).
Receive Checksum Offload (Recv CKO).
TCP Segmentation Offload (TSO).
Large Receive Offload (LRO).
We all know encrypted networking is the future, and encryption is
getting stronger, hence you need more calculating power to do
encryption/decryption. Modern high-end NIC’s have processors or
numerical co-processors to do these calculations, and you can get TLS
and IPsec offload.
That however is not in VSI plans, and I think it should be. Of course
you also need to adapt your drivers for that kind of work, and since VSI
is working on their new IP stack, in my view it would be a good thing if
they would take these off-load requirements into consideration.
If this kind of offload is not on the NIC, it will use a lot of
CPU-power, and most likely will negatively impair the network performance.
They are also thinking about the Precision Time Protocol (PTP), but for
that you need NIC’s that have a special circuit to intercept PTP packets
and send them straight to a special driver that will update the clock.
At the moment VSI is looking at two NIC’s and neither of them can do TLS
and IPsec offload or PTP.
So I think VSI should be a bit more ambitious with their VLAN VCI 2.0 plans.
AMD started building chips with both multiple CPU cores and integrated GPUs. Lots of
compute power there. Maybe use it for more than just pretty pictures on a monitor?

Dedicated CPUs / GPUs for special purposes are a great idea. I'm thinking the
devil is in the details. You'd need lots of standards, which can be good and bad.
While VSI isn't in the HW business, on-chip GPUs might be a way around that.
Dirk Munk
2017-01-06 21:52:22 UTC
Permalink
Raw Message
Post by David Froble
Post by Dirk Munk
I’ve seen the plans for LAN VCI 2.0, but I wonder if they are
ambitious enough. I’m mostly referring to the hardware offload
Send checksum offload (Send CKO).
Receive Checksum Offload (Recv CKO).
TCP Segmentation Offload (TSO).
Large Receive Offload (LRO).
We all know encrypted networking is the future, and encryption is
getting stronger, hence you need more calculating power to do
encryption/decryption. Modern high-end NIC’s have processors or
numerical co-processors to do these calculations, and you can get TLS
and IPsec offload.
That however is not in VSI plans, and I think it should be. Of course
you also need to adapt your drivers for that kind of work, and since
VSI is working on their new IP stack, in my view it would be a good
thing if they would take these off-load requirements into consideration.
If this kind of offload is not on the NIC, it will use a lot of
CPU-power, and most likely will negatively impair the network
performance.
They are also thinking about the Precision Time Protocol (PTP), but
for that you need NIC’s that have a special circuit to intercept PTP
packets and send them straight to a special driver that will update
the clock.
At the moment VSI is looking at two NIC’s and neither of them can do
TLS and IPsec offload or PTP.
So I think VSI should be a bit more ambitious with their VLAN VCI 2.0 plans.
AMD started building chips with both multiple CPUs, and graphics CPUs.
Lots of compute power there. Maybe use it for more than just pretty
pictures on a monitor?
Dedicated CPUs / GPUs for special purposes is a great idea. I'm
thinking the devil is in the details. Need lots of standards, which can
be good and bad. While VSI isn't in the HW business, on-chip GPUs might
be a way around that.
GPUs are not on server processors, just on desktop and mobile processors.

Suppose you have a server motherboard with four sockets; that would mean
you would also have four GPUs. It doesn't make sense.
IanD
2017-01-10 06:36:58 UTC
Permalink
Raw Message
Will the new AMD server CPUs (Zen) have additional security number-crunching abilities? I heard they will, but I could just be daydreaming that...

I believe the top-end AMD server CPU will have 32 cores. VMS could lock the security-crunching abilities to a core or two and have plenty of room to spare.

I think I/O is going to become the bottleneck of systems again, at least until all-memory machines grace us, like HP's The Machine, if it makes it out before HP is bought/sold.
Dirk Munk
2017-01-10 11:31:05 UTC
Permalink
Raw Message
Post by IanD
Will the new AMD server CPUs (Zen) have additional security number crunching abilities? I heard they will but I could be just daydreaming that...
I believe the top end AMD server CPU will have 32 cores. VMS could lock the security crunching abilities to a core or two and have plenty of room to spare
I think I/O is going to become the bottleneck of systems again, at least until we see all memory machines grace us, like HP's The Machine, if it makes it before HP is bought/sold
Yes, there is a problem with IO.

Let's take a 100Gb Ethernet controller. You need an x16 PCIe 3.0 slot to
use such a NIC!
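The back-of-the-envelope arithmetic behind that claim (nominal PCIe figures, ignoring protocol overhead beyond the 128b/130b encoding):

    #include <stdio.h>

    int main(void)
    {
        /* PCIe 3.0: 8 GT/s per lane, 128b/130b encoding, per direction. */
        double lane_gbps = 8.0 * 128.0 / 130.0;   /* ~7.9 Gb/s per lane  */
        double x8_gbps   = 8.0  * lane_gbps;      /* ~63 Gb/s: too slow  */
        double x16_gbps  = 16.0 * lane_gbps;      /* ~126 Gb/s: enough   */

        printf("x8  slot: ~%.0f Gb/s per direction\n", x8_gbps);
        printf("x16 slot: ~%.0f Gb/s per direction\n", x16_gbps);
        printf("100GbE NIC needs 100 Gb/s per direction\n");
        return 0;
    }

An x8 slot tops out well below 100 Gb/s per direction, so only an x16 slot can feed such a NIC at line rate.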

Server CPUs may also have circuits to connect Ethernet interfaces
directly. That would be the best solution: PTP could update the clock
of the NIC, and the CPU could read that clock directly. All the
encryption/decryption circuits could also be incorporated.
Stephen Hoffman
2017-01-10 18:31:46 UTC
Permalink
Raw Message
Post by Dirk Munk
Post by IanD
I think I/O is going to become the bottleneck of systems again, at
least until we see all memory machines grace us, like HP's The Machine,
if it makes it before HP is bought/sold
Yes, there is a problem with IO.
Let's take a 100Gb Ethernet controller. You need a x16 PCIe 3.0 slot to
use such a NIC!
You need an OS fast enough to deal with all that, too.

Which usually means offloading of some ilk either to the NIC or to a
core, though that adds complexity and potentially NIC-specific hardware
dependencies...

https://lwn.net/Articles/629155/
https://groups.google.com/d/msg/comp.os.vms/Bi50ZZlHDz0/IOdbGk1nHDQJ
--
Pure Personal Opinion | HoffmanLabs LLC
Dirk Munk
2017-01-11 08:59:26 UTC
Permalink
Raw Message
Post by Stephen Hoffman
Post by Dirk Munk
Post by IanD
I think I/O is going to become the bottleneck of systems again, at
least until we see all memory machines grace us, like HP's The
Machine, if it makes it before HP is bought/sold
Yes, there is a problem with IO.
Let's take a 100Gb Ethernet controller. You need a x16 PCIe 3.0 slot
to use such a NIC!
You need an OS fast enough to deal with all that, too.
Which usually means offloading of some ilk either to the NIC or to a
core, though that adds complexity and potentially NIC-specific hardware
dependencies...
https://lwn.net/Articles/629155/
https://groups.google.com/d/msg/comp.os.vms/Bi50ZZlHDz0/IOdbGk1nHDQJ
I'm fully aware of that problem; that is why I started this thread. If
the NIC can deal with TCP, TLS, IPsec and iSCSI-related processing in
hardware, that can greatly reduce the workload on the CPU.

Dedicating one CPU core to a NIC, or even two (one each for transmit and
receive traffic), could also help.

My point is that the new LAN drivers and the new IP stack should be
written in such a way that this kind of offload is possible with a
suitable NIC, and I don't see that in the plans.
David Froble
2017-01-11 17:14:24 UTC
Permalink
Raw Message
Post by Dirk Munk
Post by Stephen Hoffman
Post by Dirk Munk
Post by IanD
I think I/O is going to become the bottleneck of systems again, at
least until we see all memory machines grace us, like HP's The
Machine, if it makes it before HP is bought/sold
Yes, there is a problem with IO.
Let's take a 100Gb Ethernet controller. You need a x16 PCIe 3.0 slot
to use such a NIC!
You need an OS fast enough to deal with all that, too.
Which usually means offloading of some ilk either to the NIC or to a
core, though that adds complexity and potentially NIC-specific hardware
dependencies...
https://lwn.net/Articles/629155/
https://groups.google.com/d/msg/comp.os.vms/Bi50ZZlHDz0/IOdbGk1nHDQJ
I'm fully aware of that problem, that is why I started this thread. If
the NIC can deal with TCP, TLS, IPsec and iSCSI related processing in
hardware, that can greatly reduce the work load on the CPU.
Dedicating one CPU core to a NIC, or even two (transmit & receive
traffic), could also help.
My point is that the new LAN drivers and the new IP stack should be
written in such a way that this kind of offload is possible with a
suitable NIC, and I don't see that in the plans.
Not sure what the advantage is of locking a core or two to some specific purpose.
As long as there are sufficient cycles, why not just let the scheduler assign
tasks as required? And if there are not sufficient cycles, what is the need of a
faster network when the rest of the processing cannot keep up with it?

Is this a solution looking for a problem?
Stephen Hoffman
2017-01-11 18:02:36 UTC
Permalink
Raw Message
Post by David Froble
Not sure what the advantage is to lock a core or two for some specific
purpose. As long as there is sufficient cycles, why not just let the
scheduler assign tasks as required? And if there is not sufficient
cycles, what is the need of faster network when the rest of the
processing cannot keep up with it?
At speed, caching behaviors can thrash your performance. The process
scheduler tries to avoid dumping the caches unnecessarily, but things
don't always work out as some applications want or need. For some
applications, the lock manager performance can be improved by
dedicating a core. Same with dedicating cores for some apps, too.
Post by David Froble
Is this a solution looking for a problem?
Shoveling bits through a modern NIC at something approaching line
speeds is an increasing problem. See the linked articles for
background.

Unfortunately, the whiteboards and the features-and-enhancements
database at Bolton are not of infinite length.
--
Pure Personal Opinion | HoffmanLabs LLC
David Froble
2017-01-11 22:23:52 UTC
Permalink
Raw Message
Post by Stephen Hoffman
Post by David Froble
Not sure what the advantage is to lock a core or two for some specific
purpose. As long as there is sufficient cycles, why not just let the
scheduler assign tasks as required? And if there is not sufficient
cycles, what is the need of faster network when the rest of the
processing cannot keep up with it?
At speed, caching behaviors can thrash your performance. The process
scheduler tries to avoid dumping the caches unnecessariy, but things
don't always work out as some applications want or need. For some
applications, the lock manager performance can be improved by dedicating
a core. Same with dedicating cores for some apps, too.
Post by David Froble
Is this a solution looking for a problem?
Shoveling bits through a modern NIC at something approaching line speeds
is an increasing problem. See the linked articles for background.
Unfortunately, the whiteboards and the features-and-enhancements
database at Bolton are not ∞ infinite length.
Well, if there are sufficient cycles, then perhaps a scheduler would keep a few
CPUs on a particular task. However, while that may be efficient for a particular
task, if the system cannot keep up with that task, then perhaps that task doesn't
have so much work to do.

I do understand how something like that could be more efficient for the
networking work, but I've got to wonder what a system could be doing to need
such network capability? Maybe streaming movies over the internet?
Stephen Hoffman
2017-01-11 23:26:12 UTC
Permalink
Raw Message
Post by David Froble
Post by Stephen Hoffman
Post by David Froble
Not sure what the advantage is to lock a core or two for some specific
purpose. As long as there is sufficient cycles, why not just let the
scheduler assign tasks as required? And if there is not sufficient
cycles, what is the need of faster network when the rest of the
processing cannot keep up with it?
At speed, caching behaviors can thrash your performance. The process
scheduler tries to avoid dumping the caches unnecessariy, but things
don't always work out as some applications want or need. For some
applications, the lock manager performance can be improved by
dedicating a core. Same with dedicating cores for some apps, too.
Post by David Froble
Is this a solution looking for a problem?
Shoveling bits through a modern NIC at something approaching line
speeds is an increasing problem. See the linked articles for
background.
Unfortunately, the whiteboards and the features-and-enhancements
database at Bolton are not ∞ infinite length.
Well, if there is sufficient cycles, then perhaps a scheduler would
keep a few CPUs on a particular task.
The OpenVMS process scheduler already tries to do that: to keep
processes on the same cores, avoiding having to flush and reload
caches. Hyperthreading on Itanium is related here, as — on older
Itanium processors — it's closer to a fast context switch. Things are
a bit better there on the most recent Itanium processors.

http://h41379.www4.hpe.com/doc/84final/ba322_90087/apbs04.html

See the integrated class scheduler support in OpenVMS, if you want to
customize the allocations and related behaviors for your own processes.

http://h41379.www4.hpe.com/openvms/journal/v15/class_schedule.html
Post by David Froble
However, while it may be efficient for a particular task, if the system
cannot keep up with that task, then perhaps that task might not have so
much work to do.
As NIC speeds increase, the devices can produce interrupts —
notifications of activity that the software needs to process — faster
than the software can field them. That's what the LWN article is
discussing; it's at a level "below" that of the process (and KP thread)
scheduling; not something occurring in process context. That's where
the device drivers and I/O routines are having "fun" running that fast,
and then they have to fan that data out.

https://lwn.net/Articles/629155/
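To put rough numbers on it, here are frame rates at 100 Gb/s line rate (each frame also carries 20 bytes of preamble and inter-frame gap):

    #include <stdio.h>

    int main(void)
    {
        double line_bps = 100e9;              /* 100 Gb/s             */
        int sizes[] = { 64, 512, 1500 };      /* Ethernet frame sizes */
        for (int i = 0; i < 3; i++) {
            /* add 20 bytes per frame for preamble + inter-frame gap */
            double fps = line_bps / ((sizes[i] + 20) * 8.0);
            printf("%4d-byte frames: %6.1f million frames/sec\n",
                   sizes[i], fps / 1e6);
        }
        return 0;
    }

Even with full-size 1500-byte frames that is over eight million frames per second, which is why interrupt coalescing, polling and offload, rather than an interrupt per packet, are needed at these speeds.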

This ties back into what started this thread, which was the attempt to
offload some of the overhead onto a device. This is akin to what DMA does
for copying data around, or what CI adapters — some of these CI
adapters are very complex widgets — could do with CI communications
without involving the host processor, including controller-initiated
DMA operations into host memory in at least some cases. NICs can now
provide something similar here, offloading parts of network processing
into the adapter.

https://en.wikipedia.org/wiki/TCP_offload_engine
Post by David Froble
I do understand how something like that could be more efficient for the
networking work, but I've got to wonder what a system could be doing to
need such network capability? Maybe streaming movies over the internet?
Receiving large quantities of data from {expurgated} is not a
particularly obscure application on some of the OpenVMS systems in use.
The first example I encountered, years ago, was accepting data from
{expurgated} arriving at a rate just under the SBI (backplane)
bandwidth into a VAX-11/785 box, and I've worked with much more
recent examples on much newer and much higher-end hardware. Getting
that data reduced and written out — or just streamed to other servers —
for additional or subsequent processing or archival processing were
interesting problems in their own right. I'm aware of sites now that
have substantial cluster configurations reading and processing such
feeds. These usually come with very large bandwidth and very low latency
requirements, and often with very big budgets, too.

Many implementations of video streaming are certainly not lightweight,
though that's not a sort of application I've encountered on OpenVMS
servers, nor software for it, for that matter. It can certainly work.
Media streaming applications also tend to be distributed across
multiple servers and data centers and across geographies, and using
cheaper boxes can work effectively for that. Or you use one or more
content delivery networks, and outsource parts of the effort.

https://www.akamai.com/us/en/resources/video-streaming-server.jsp
https://www.adobe.com/products/adobe-media-server-family.html
--
Pure Personal Opinion | HoffmanLabs LLC
Kerry Main
2017-01-14 19:21:12 UTC
Permalink
Raw Message
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 11, 2017 6:26 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
Post by David Froble
Post by Stephen Hoffman
Post by David Froble
Not sure what the advantage is of locking a core or two to some specific
purpose. As long as there are sufficient cycles, why not just let the
scheduler assign tasks as required? And if there are not sufficient
cycles, what is the need of a faster network when the rest of the
processing cannot keep up with it?
At speed, caching behaviors can thrash your performance. The process
scheduler tries to avoid dumping the caches unnecessarily, but things
don't always work out as some applications want or need. For some
applications, the lock manager performance can be improved by
dedicating a core. Same with dedicating cores for some apps, too.
Post by David Froble
Is this a solution looking for a problem?
Shoveling bits through a modern NIC at something approaching line
speeds is an increasing problem. See the linked articles for
background.
Unfortunately, the whiteboards and the features-and-enhancements
database at Bolton are not of infinite length.
Well, if there are sufficient cycles, then perhaps a scheduler would
keep a few CPUs on a particular task.
The OpenVMS process scheduler already tries to do that: to keep
processes on the same cores, avoiding having to flush and reload
caches. Hyperthreading on Itanium is related here, as — on older
Itanium processors — it's closer to a fast context switch. Things are
a bit better there on the most recent Itanium processors.
http://h41379.www4.hpe.com/doc/84final/ba322_90087/apbs04.html
See the integrated class scheduler support in OpenVMS, if you want to
customize the allocations and related behaviors for your own
processes.
http://h41379.www4.hpe.com/openvms/journal/v15/class_schedule.html
Post by David Froble
However, while that may be efficient for a particular task, if the
system cannot keep up with that task, then perhaps that task doesn't
have so much work to do.
As NIC speeds increase, the devices can produce interrupts —
notifications of activity that the software needs to process — faster
than the software can field them. That's what the LWN article is
discussing; it's at a level "below" that of the process (and KP thread)
scheduling; not something occurring in process context. That's that
the device drivers and I/O routines are having "fun" running that fast,
then they have to fan that data out.
https://lwn.net/Articles/629155/
This ties back into what started this thread, which were attempts to
offload some of the overhead onto a device. This akin to what DMA
does for copying data around, or what CI adapters — some of these CI
adapters are very complex widgets — could do with CI
communications without involving the host processor, including
controller-initiated
DMA operations into host memory in at least some cases. NICs can now
provide something similar here, offloading parts of network processing
into the adapter.
https://en.wikipedia.org/wiki/TCP_offload_engine
Post by David Froble
I do understand how something like that could be more efficient for
the networking work, but I've got to wonder what a system could be
doing to need such network capability? Maybe streaming movies
over the internet?
Receiving large quantities of data from {expurgated} is not a
particularly obscure application on some of the OpenVMS systems in use.
First example I'd encountered years ago was accepting data from
{expurgated} arriving at a rate just under the SBI (backplane)
bandwidth speed into a VAX-11/785 box, and I've worked with much
more recent examples on much newer and much higher-end
hardware. Getting that data reduced and written out — or just
streamed to other servers — for additional or subsequent processing
or archival processing were
interesting problems in their own accord. I'm aware of sites now that
have substantial cluster configurations reading and processing such
feeds. These usually with very large bandwidth and very low latency
requirements, and often with very big budgets, too.
Many implementations of video streaming are certainly not
lightweight, though that's also not a sort of application I've
encountered on
OpenVMS. It can certainly work, it's just not something I've
encountered on OpenVMS servers. Nor software for it, for that matter.
Media streaming applications also tend to be distributed across
multiple servers and data centers and across geographies, and using
cheaper boxes can work effectively for that. Or you use one or more
content delivery networks, and outsource parts of the effort.
https://www.akamai.com/us/en/resources/video-streaming-server.jsp
https://www.adobe.com/products/adobe-media-server-family.html
In a similar vein - one should never underestimate the competition, and the same goes for mainframes.

It's interesting to keep current with what the mainframe folks are up to - especially with the industry-wide heavy DC consolidation and "cloud" designs evolving.

Check out IBM's newest z13 mainframe capabilities (hint: 8,000 VMs, dedicated cryptographic security co-processors, huge numbers of IO controllers (each with Power chips) that offload work from the primary cores, etc., all on a single mainframe).

Regards,

Kerry Main
Kerry dot main at starkgaming dot com
Bob Koehler
2017-01-12 14:17:27 UTC
Permalink
Raw Message
Post by David Froble
Well, if there is sufficient cycles, then perhaps a scheduler would keep a few
CPUs on a particular task. However, while it may be efficient for a particular
task, if the system cannot keep up with that task, then perhaps that task might
not have so much work to do.
Which is fine for an ACP working the network stack, but not for a
driver. VMS scheduling can only control processes, not
interrupt-driven kernel code. Those interrupts are at a higher
priority than the one which drives the scheduler.
Dirk Munk
2017-01-11 20:05:12 UTC
Permalink
Raw Message
Post by David Froble
Post by Dirk Munk
Post by Stephen Hoffman
Post by Dirk Munk
Post by IanD
I think I/O is going to become the bottleneck of systems again, at
least until we see all memory machines grace us, like HP's The
Machine, if it makes it before HP is bought/sold
Yes, there is a problem with IO.
Let's take a 100Gb Ethernet controller. You need a x16 PCIe 3.0 slot
to use such a NIC!
You need an OS fast enough to deal with all that, too.
Which usually means offloading of some ilk either to the NIC or to a
core, though that adds complexity and potentially NIC-specific hardware
dependencies...
https://lwn.net/Articles/629155/
https://groups.google.com/d/msg/comp.os.vms/Bi50ZZlHDz0/IOdbGk1nHDQJ
I'm fully aware of that problem, that is why I started this thread. If
the NIC can deal with TCP, TLS, IPsec and iSCSI related processing in
hardware, that can greatly reduce the work load on the CPU.
Dedicating one CPU core to a NIC, or even two (transmit & receive
traffic), could also help.
My point is that the new LAN drivers and the new IP stack should be
written in such a way that this kind of offload is possible with a
suitable NIC, and I don't see that in the plans.
Not sure what the advantage is to lock a core or two for some specific
purpose. As long as there is sufficient cycles, why not just let the
scheduler assign tasks as required? And if there is not sufficient
cycles, what is the need of faster network when the rest of the
processing cannot keep up with it?
Is this a solution looking for a problem?
A 100 Gb NIC can receive and send about 10GB per second, so a total
dataflow of 20GB/sec; that is very much.

Modern Xeon CPUs have many cores. If a system indeed has such an
enormous dataflow, then there's nothing wrong with allocating two of the
24 cores of a CPU to a NIC, certainly not if it makes networking
more efficient.
Michael Moroney
2017-01-11 20:17:25 UTC
Permalink
Raw Message
Post by David Froble
Post by Dirk Munk
I'm fully aware of that problem, that is why I started this thread. If
the NIC can deal with TCP, TLS, IPsec and iSCSI related processing in
hardware, that can greatly reduce the work load on the CPU.
Dedicating one CPU core to a NIC, or even two (transmit & receive
traffic), could also help.
My point is that the new LAN drivers and the new IP stack should be
written in such a way that this kind of offload is possible with a
suitable NIC, and I don't see that in the plans.
Not sure what the advantage is to lock a core or two for some specific purpose.
As long as there is sufficient cycles, why not just let the scheduler assign
tasks as required? And if there is not sufficient cycles, what is the need of
faster network when the rest of the processing cannot keep up with it?
Is this a solution looking for a problem?
VMS currently has Fast Path preferred-CPU selection for certain device drivers,
including the Fibre Channel, LAN and PE drivers. I don't really know how it all
works, other than that the idea is to keep caches consistent and to avoid thrashing/
interprocessor communication. I believe the lock manager can be assigned to a
particular CPU in the same way as well.
Dirk Munk
2017-01-11 20:27:41 UTC
Permalink
Raw Message
Post by Michael Moroney
Post by David Froble
Post by Dirk Munk
I'm fully aware of that problem, that is why I started this thread. If
the NIC can deal with TCP, TLS, IPsec and iSCSI related processing in
hardware, that can greatly reduce the work load on the CPU.
Dedicating one CPU core to a NIC, or even two (transmit & receive
traffic), could also help.
My point is that the new LAN drivers and the new IP stack should be
written in such a way that this kind of offload is possible with a
suitable NIC, and I don't see that in the plans.
Not sure what the advantage is to lock a core or two for some specific purpose.
As long as there is sufficient cycles, why not just let the scheduler assign
tasks as required? And if there is not sufficient cycles, what is the need of
faster network when the rest of the processing cannot keep up with it?
Is this a solution looking for a problem?
VMS currently has Fastpath Preferred CPU selection for certain device drivers,
including the Fibrechannel, LAN and PE drivers. I don't really know how it all
works other than the idea is to keep cache consistent and to avoid thrashing/
interprocessor communication. I believe the lock manager can be assigned to a
particular CPU in the same way as well.
Thanks Michael,

Do you have any other remarks on what I wrote?
Stephen Hoffman
2017-01-11 20:57:45 UTC
Permalink
Raw Message
Post by Michael Moroney
VMS currently has Fastpath Preferred CPU selection for certain device
drivers, including the Fibrechannel, LAN and PE drivers. I don't really
know how it all works other than the idea is to keep cache consistent
and to avoid thrashing/ interprocessor communication.
Ayup.

Details:

http://h41379.www4.hpe.com/doc/82final/5841/5841pro_069.html

Among the more confusing feature names used in OpenVMS, Fast Path and
Fast I/O are always entertaining to watch folks sort out. But I
digress.
Post by Michael Moroney
I believe the lock manager can be assigned to a particular CPU in the
same way as well.
Ayup.

Details:

http://h41379.www4.hpe.com/doc/731final/5841/5841pro_020.html#dedicated_cpu_lockmgr


Pointers to technical journal articles and such are left to the reader.

The whole of the I/O system is overdue for a look, though. That'll become
much more interesting as faster NICs and faster storage become
ubiquitous for OpenVMS installations, and also if new and
"layer-crossing" development work akin to what ZFS provides is
considered, too. That's whether as part of rewriting and re-architecting
the shadow driver, or otherwise.
--
Pure Personal Opinion | HoffmanLabs LLC
Craig A. Berry
2017-01-12 01:05:55 UTC
Permalink
Raw Message
Post by Stephen Hoffman
Post by Michael Moroney
VMS currently has Fastpath Preferred CPU selection for certain device
drivers, including the Fibrechannel, LAN and PE drivers. I don't
really know how it all works other than the idea is to keep cache
consistent and to avoid thrashing/ interprocessor communication.
Ayup.
http://h41379.www4.hpe.com/doc/82final/5841/5841pro_069.html
Among the more confusing feature names used in OpenVMS, Fast Path and
Fast I/O are always entertaining to watch folks sort out. But I digress.
Post by Michael Moroney
I believe the lock manager can be assigned to a particular CPU in the
same way as well.
Ayup.
http://h41379.www4.hpe.com/doc/731final/5841/5841pro_020.html#dedicated_cpu_lockmgr
I don't think anyone has yet mentioned the Packet Processing Engine
(PPE) in the current thread:

<http://h41379.www4.hpe.com/doc/84final/tcprn/tcp_rnpro.html#ppe>

This isn't one CPU per NIC; it's one dedicated CPU per network stack,
but with VCI 1.x that may be as good as you can do anyway.

Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets its own CPU. s/Lock/Limerick/ as appropriate.
Dirk Munk
2017-01-12 13:32:56 UTC
Permalink
Raw Message
Post by Craig A. Berry
Post by Stephen Hoffman
Post by Michael Moroney
VMS currently has Fastpath Preferred CPU selection for certain device
drivers, including the Fibrechannel, LAN and PE drivers. I don't
really know how it all works other than the idea is to keep cache
consistent and to avoid thrashing/ interprocessor communication.
Ayup.
http://h41379.www4.hpe.com/doc/82final/5841/5841pro_069.html
Among the more confusing feature names used in OpenVMS, Fast Path and
Fast I/O are always entertaining to watch folks sort out. But I digress.
Post by Michael Moroney
I believe the lock manager can be assigned to a particular CPU in the
same way as well.
Ayup.
http://h41379.www4.hpe.com/doc/731final/5841/5841pro_020.html#dedicated_cpu_lockmgr
I don't think anyone has yet mentioned the Packet Processing Engine
<http://h41379.www4.hpe.com/doc/84final/tcprn/tcp_rnpro.html#ppe>
This isn't one CPU per NIC; it's one dedicated CPU per network stack,
but with VCI 1.x that may be as good as you can do anyway.
Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets it's own CPU. s/Lock/Limerick/ as appropriate.
Excellent find Craig, thanks.
Michael Moroney
2017-01-12 15:34:26 UTC
Permalink
Raw Message
Post by Craig A. Berry
Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets it's own CPU. s/Lock/Limerick/ as appropriate.
I wonder if the author got confused "how could it be 'distributed' if
it's running on its own CPU?" or something, and thought 'distributed'
was itself a typo.

I came up with an odd idea last night that may be useful some day.
Have a device driver or execlet that takes over a core completely.
The core doesn't run VMS and is seen as unavailable to VMS or the
scheduler. Instead, the core runs "microcode" of some sort, hopefully
faster/more efficiently than normal VMS system code would.
Dirk Munk
2017-01-12 17:19:10 UTC
Permalink
Raw Message
Post by Michael Moroney
Post by Craig A. Berry
Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets it's own CPU. s/Lock/Limerick/ as appropriate.
I wonder if the author got confused "how could it be 'distributed' if
it's running on its own CPU?" or something, and thought 'distributed'
was itself a typo.
I came up with an odd idea last night that may be useful some day.
Have a device driver or execlet that takes over a core completely.
The core doesn't run VMS and is seen as unavailable to VMS or the
scheduler. Instead, the core runs "microcode" of some sort, hopefully
faster/more efficiently than normal VMS system code would.
I think it's a great idea. Latency is the biggest problem with high-performance
networks and storage, and having dedicated cores will certainly reduce latency.

Combine it with a separate TCP/IP core, and you may get a system that is
really tuned for speed.
Kerry Main
2017-01-13 03:31:43 UTC
Permalink
Raw Message
-----Original Message-----
Dirk Munk via Info-vax
Sent: January 12, 2017 12:19 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
Post by Michael Moroney
Post by Craig A. Berry
Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets its own CPU. s/Lock/Limerick/ as appropriate.
Post by Michael Moroney
I wonder if the author got confused "how could it be 'distributed' if
it's running on its own CPU?" or something, and thought 'distributed'
was itself a typo.
I came up with an odd idea last night that may be useful some day.
Have a device driver or execlet that takes over a core completely.
The core doesn't run VMS and is seen as unavailable to VMS or the
scheduler. Instead, the core runs "microcode" of some sort, hopefully
faster/more efficiently than normal VMS system code would.
I think it's a great idea. Latency is the biggest problem with high-performance
networks and storage, and having dedicated cores will certainly reduce latency.
Combine it with a separate TCP/IP core, and you may get a system that
is really tuned for speed.
Or better yet, for a next-gen DLM option, use RoCEv2 high-performance,
ultra-low-latency drivers and bypass the network stack altogether.



http://bit.ly/2imdiE2 (see slide 9 - transparent to existing
applications)

Original:
https://www.openfabrics.org/images/eventpresos/workshops2014/DevWorkshop/presos/Wednesday/pdf/02_RoCEv2forOFA.pdf
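For what it's worth, the user-space side of RoCE/RDMA is the verbs API (libibverbs); the first step, just enumerating RDMA-capable adapters, looks like this (real libibverbs calls - whether anything like this ever appears on OpenVMS is pure speculation):

    #include <stdio.h>
    #include <infiniband/verbs.h>   /* libibverbs: link with -libverbs */

    int main(void)
    {
        int n = 0;
        /* RoCE-capable NICs show up in the RDMA device list too. */
        struct ibv_device **list = ibv_get_device_list(&n);
        if (list == NULL) {
            perror("ibv_get_device_list");
            return 1;
        }
        for (int i = 0; i < n; i++)
            printf("RDMA device: %s\n", ibv_get_device_name(list[i]));
        ibv_free_device_list(list);
        return 0;
    }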


Regards,

Kerry Main
Kerry dot main at starkgaming dot com
Stephen Hoffman
2017-01-12 17:38:11 UTC
Permalink
Raw Message
Post by Michael Moroney
Post by Craig A. Berry
Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets it's own CPU. s/Lock/Limerick/ as appropriate.
I wonder if the author got confused "how could it be 'distributed' if
it's running on its own CPU?" or something, and thought 'distributed'
was itself a typo.
The DLM project was known as the dedicated lock manager, IIRC.
Post by Michael Moroney
I came up with an odd idea last night that may be useful some day. Have
a device driver or execlet that takes over a core completely. The core
doesn't run VMS and is seen as unavailable to VMS or the scheduler.
Instead, the core runs "microcode" of some sort, hopefully faster/more
efficiently than normal VMS system code would.
Carrying that thought thru into the processor hardware design results
in something akin to Intel SGX. Which does all that, and which also
seeks to protect the code against even kernel-mode shenanigans.
Though SGX isn't intended for use by driver code.

In OpenVMS antiquity, there's the "Qualify" package, which was used to
determine when ASMP would work (better). That's from the VAX-11/780
days, and other VAX configurations with secondaries, up through the
availability of SMP at V5.0.

In the mid-term, there was Galaxy. Which could push a core out of the
current SMP configuration. Live. Which is basically what is
suggested here, though there's no other SMP system "receiving" the core
in the Galaxy. Reworking the old Galaxy code that did the STOP/CPU
equivalent — while still removing the CPU from what the scheduler has
available — is very close to what you're after.

But if you're headed this way, doing something that everybody can then
use — via the class scheduler API or otherwise — is the best approach.
DLM, IP network processing, driver fork processes, whatever.
--
Pure Personal Opinion | HoffmanLabs LLC
David Froble
2017-01-12 20:38:52 UTC
Permalink
Raw Message
Post by Michael Moroney
Post by Craig A. Berry
Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets it's own CPU. s/Lock/Limerick/ as appropriate.
I wonder if the author got confused "how could it be 'distributed' if
it's running on its own CPU?" or something, and thought 'distributed'
was itself a typo.
I came up with an odd idea last night that may be useful some day.
Have a device driver or execlet that takes over a core completely.
The core doesn't run VMS and is seen as unavailable to VMS or the
scheduler. Instead, the core runs "microcode" of some sort, hopefully
faster/more efficiently than normal VMS system code would.
I would not call that odd, I'd call that a good idea. Not sure what it might be
used for, but that doesn't detract from the idea. Might not, or might, lend
itself to general purpose stuff. Specialized stuff, quite likely.

Would this not be similar to things in the past, such as FP, vectors, and such,
but just another core on the chip this time?
d***@gmail.com
2017-01-12 15:47:58 UTC
Permalink
Raw Message
Post by Craig A. Berry
Oddly, they say it's "modeled on the OpenVMS Dedicated Lock Manager."
Which is either a typo, or it's what happens to the Distributed Lock
Manager when it gets it's own CPU. s/Lock/Limerick/ as appropriate.
The lock manager has an optional mode where all local locking operations are shuffled off to a Dedicated-CPU Lock Manager process. For certain classes of workload (and sufficiently high CPU counts) this method may allow higher local lock throughput by reducing contention on the lock manager spinlock and additional CPU memory cache benefits.

SYSGEN> HELP SYS_P LCKMGR_MODE

Sys_Parameters

LCKMGR_MODE

(Alpha and Integrity servers) The LCKMGR_MODE parameter controls
use of the Dedicated CPU Lock Manager. Setting LCKMGR_MODE to a
number greater than zero (0) indicates the number of CPUs that
must be active before the Dedicated CPU Lock Manager is turned
on.

The Dedicated CPU Lock Manager performs all locking operations
on a single dedicated CPU. This can improve system performance
on large SMP systems with high MP_Synch associated with the lock
manager.

If the number of active CPUs is greater than or equal to LCKMGR_
MODE, a LCKMGR_SERVER process is created to service locking
operations. This process runs at a real-time priority of 63 and
is always current.

In addition, if the number of active CPUs should ever be reduced
below the required threshold by either a STOP/CPU command or by
a CPU reassignment in a Galaxy configuration, the Dedicated CPU
Lock Manager automatically turns off within one second, and the
LCKMGR_SERVER is placed in a hibernate state. If the number of
active CPUs is increased, the LCKMGR_SERVER resumes servicing
locking operations.

Specify one of the following:

o Zero (0) indicates that the Dedicated CPU Lock Manager is off
(the default).

o A number greater than zero (0) indicates the number of CPUs
that must be active before the Dedicated CPU Lock Manager will
turn on.

When the Dedicated CPU Lock Manager is turned on, fast path
devices are not assigned to the CPU used by the Dedicated CPU
Lock Manager.

For more information about use of the Dedicated CPU Lock Manager,
see the OpenVMS Performance Management manual.

LCKMGR_MODE is a DYNAMIC parameter.
IanD
2017-01-13 06:53:15 UTC
Permalink
Raw Message
With a dedicated CPU for the lock manager there is one small annoying side effect.

MONITOR SYSTEM shows the lock manager process as the top resource user, since it sits at 100% CPU usage.

In time, maybe a tweak to that utility might be good, e.g. /EXCLUDE=(process1, process2...)

As it stands, one can no longer use that screen for a quick overview of who's the current system CPU hog.

My wish is that VSI cut a deal with the custodians of perfmon and ship that as standard with VMS; it's a very nice product indeed and provides a one-stop shop for monitoring VMS (T4 is OK for free, but it's piecemeal).
David Froble
2017-01-13 16:14:29 UTC
Permalink
Raw Message
Post by IanD
With a dedicated CPU to the lock manager there is one small annoying side effect
Mon system shows the lck manager process as the top resource user since it sits on 100% CPU usage
In time, maybe a tweak to that utility might be good I.e. /exclude=(process1, process2...)
What? The person who advocates openness now wants to hide something?

:-)
Stephen Hoffman
2017-01-13 17:16:00 UTC
Permalink
Raw Message
Post by David Froble
Post by IanD
With a dedicated CPU to the lock manager there is one small annoying side effect
Mon system shows the lck manager process as the top resource user since
it sits on 100% CPU usage
In time, maybe a tweak to that utility might be good I.e.
/exclude=(process1, process2...)
What? The person who advocates openness now wants to hide something?
:-)
There's some benefit around using what little display area is presented
by MONITOR — a display design which ties back to the days of 80x24
monochrome hardwired terminals and that hasn't been substantially
revisited or rethought since then, but I digress — for a compute-bound
process that's core-locked? It's not like other processes — those
outside of whatever class or construct associated with the core — can
be scheduled on that core. It'd be nice to have cogent displays across
the tools, particularly if there is to be an API to allow a process or
clump of processes to be core-locked. Galaxy doesn't show core
activity in other partitions, and this case isn't far off that.
You're not going to see other processes scheduled on dedicated cores.
MONITOR doesn't have a particular display of core activity, for that
matter. Maybe MONITOR gets a generic process display, and a display
for each of the class-scheduled activity and the core-locked processes;
separate displays for the processors dedicated to specific groups and
activities. It wouldn't surprise me to see cases where some task
needed multiple processors dedicated. Pretty soon, we get Galaxy-like
capabilities, though with the "guest" process or even the guest
operating system — who knows what that process is doing? — slightly
better known to the OpenVMS process scheduler than what happens over in
another Galaxy instance. And with no firmware mods, BTW.
--
Pure Personal Opinion | HoffmanLabs LLC
Stephen Hoffman
2017-01-10 18:23:02 UTC
Permalink
Raw Message
Post by IanD
Will the new AMD server CPUs (Zen) have additional security number
crunching abilities?
This stuff?

https://www.bleepingcomputer.com/news/hardware/researchers-point-out-theoretical-security-flaws-in-amds-upcoming-zen-cpu/
Post by IanD
I heard they will but I could be just daydreaming that...
If "Security Number Crunching" is in reference to random number
generation, then OpenVMS doesn't commonly use cryptographic random
number generators in its present APIs. Intel provides random number
generation, as likely do various other processor vendors. Various
operating systems use what the hardware provides, as well as acquiring
and using other sources of entropy to generate random numbers.
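As an aside, the Intel hardware generator is reachable from C via the RDRAND intrinsic; a minimal sketch (x86-64, build with a compiler that supports -mrdrnd; the output would normally be mixed into a CSPRNG along with other entropy sources rather than used directly):

    #include <stdio.h>
    #include <immintrin.h>   /* _rdrand64_step */

    /* Returns 0 on success. RDRAND can transiently report failure,
     * so retry a few times before giving up. */
    static int hw_random_u64(unsigned long long *out)
    {
        for (int tries = 0; tries < 10; tries++)
            if (_rdrand64_step(out))
                return 0;
        return -1;
    }

    int main(void)
    {
        unsigned long long r;
        if (hw_random_u64(&r) == 0)
            printf("hardware random value: %llu\n", r);
        return 0;
    }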

Or if that "security number crunking" — err, sorry, my bad, "security
number crunching" — is a reference to features akin to SGX — as the
above-cited URL certainly implies — Intel has that feature with various
shipping processors. Whether SGX is actually secure against attacks
will probably take a year or three to learn, too. Same applies for
AMD's SEV.

Intel SGX:

https://software.intel.com/en-us/blogs/2013/09/26/protecting-application-secrets-with-intel-sgx
Post by IanD
I believe the top end AMD server CPU will have 32 cores.
When adding enough cores, every OpenVMS application eventually
encounters Amdahl's Law. Or MPSYNCH. Or both. Call back when there's
better language support here to allow easier use of all these cores,
too. For the apps that need and can use all of those cores effectively.

https://en.wikipedia.org/wiki/Amdahl's_law
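A quick illustration of that law, using an assumed 95% parallel fraction (an assumption for the example, not a measurement of any OpenVMS application):

    #include <stdio.h>

    int main(void)
    {
        double p = 0.95;                      /* assumed parallel fraction */
        int cores[] = { 4, 8, 16, 32, 64 };
        for (int i = 0; i < 5; i++) {
            /* Amdahl's law: speedup = 1 / ((1 - p) + p / n) */
            double s = 1.0 / ((1.0 - p) + p / cores[i]);
            printf("%2d cores: %4.1fx speedup\n", cores[i], s);
        }
        return 0;
    }

Going from 32 to 64 cores buys only roughly a 20% gain in that example, before any MPSYNCH or cache-coherency overhead is even counted.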
Post by IanD
VMS could lock the security crunching abilities to a core or two and
have plenty of room to spare
OpenVMS has to boot on x86-64 first, then get I/O and the rest of the
platform sorted out, then decide what new features are needed including
whether SGX or SEV or other features make any sense to develop and
support. Given the choice, I'd rather have a cryptographic pseudo
random number generator (CPRNG) available before SGX or SEV support,
though.
Post by IanD
I think I/O is going to become the bottleneck of systems again,
Again? It's still not anywhere near processor speeds, though
available non-volatile storage is certainly vastly better and faster
than the classic non-solid-state HDD storage. Off-chip speeds have
been the bottleneck for aeons, and that'll only continue to be the case
— until it's all on the SoC, or such. Whether that off-chip access is
called I/O or not?
Post by IanD
at least until we see all memory machines grace us, like HP's The
Machine, if it makes it before HP is bought/sold
We already have some of that storage available now, and it's a question of
what that storage looks like, how it works, and what it costs, as the prices
continue to drop and production increases.

For those with the budget for it, HPE's The Machine only starts to
become relevant when technical specs and developer docs and seeds start
to become available. Until then, it's vaporware. Or whether we're
discussing AArch64 servers or RISC V configurations in a few years, for
that matter.
--
Pure Personal Opinion | HoffmanLabs LLC
Kerry Main
2017-01-14 17:00:58 UTC
Permalink
Raw Message
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 10, 2017 1:23 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
Post by IanD
Will the new AMD server CPUs (Zen) have additional security number-crunching abilities?
This stuff?
https://www.bleepingcomputer.com/news/hardware/researchers-point-out-theoretical-security-flaws-in-amds-upcoming-zen-cpu/
[snip..]
Post by IanD
I believe the top end AMD server CPU will have 32 cores.
When adding enough cores, every OpenVMS application eventually
encounters Amdahl's Law. Or MPSYCH. Or both. Call back when
there's better language support here to allow easier use of all these
cores, too. For the apps that need and can use all of those cores
effectively.
Pure speculation, but imho, with modern CPU core technology, likely fewer than 10-20% (perhaps less) of applications need more than 12-16 cores today. Most VMware VMs have 4 or fewer vCPUs.

Having stated this, and to your point, the next-generation strategy is not to keep improving server utilization via the VM sprawl that VMware propagates, but rather to improve OpenVMS so it can share applications and create separate environments WITHIN OpenVMS. I think Clair stated something like this earlier..

Some examples:
- $show users would only show the users logged on in a specific "group" that this user is in.
- $show system or $show memory would only show the resources assigned to that "group".
- Galaxy-like technology would allow dynamic sharing of CPU/memory resources, so if a group needed extra CPU or memory, it could be easily allocated or deallocated (dynamically, manually or via business rules).

You could still have domain wide Sys priv's that would allow traditional system wide access.

Yes, this is also where LDAP/Enterprise Directory would be a critical component of the overall next gen solution.
https://en.wikipedia.org/wiki/Amdahl's_law
Post by IanD
VMS could lock the security crunching abilities to a core or two and
have plenty of room to spare
OpenVMS has to boot on x86-64 first, then get I/O and the rest of the
platform sorted out, then decide what new features are needed
including whether SGX or SEV or other features make any sense to
develop and support. Given the choice, I'd rather have a cryptographic
pseudo random number generator (CPRNG) available before SGX or
SEV support, though.
Post by IanD
I think I/O is going to become the bottleneck of systems again,
Again? It's still not anywhere near processor speeds, though
available non-volatile storage is certainly vastly better and faster
than the classic non-solid-state HDD storage. Off-chip speeds have
been the bottleneck for aeons, and that'll only continue to be the case
— until it's all on the SoC, or such. Whether that off-chip access is
called I/O or not?
Post by IanD
at least until we see all memory machines grace us, like HP's The
Machine, if it makes it before HP is bought/sold
Intel/Micron are apparently going to bring 3D XPoint (pronounced "crosspoint") non-volatile, very fast, TB-scale memory technologies out sometime this year.

Late 2016 article on 3D XPoint:
Shortened:
http://bit.ly/2iTM6Ok

Original:
http://www.digitaltrends.com/computing/intel-micron-3d-xpoint-new-details-emerge/

Another article on very fast and very large SSD technologies:

Seagate’s new 60TB SSD is world’s largest:
http://bit.ly/2aPJcLz

or,

http://arstechnica.com/gadgets/2016/08/seagate-unveils-60tb-ssd-the-worlds-largest-hard-drive/
Quote - "reach 1PB with only 17 drives"

Put in perspective, the decision to address the 2TB disk size limitation in OpenVMS looks like a pretty good one.

:-)
We already have some of that storage available now, and it's what that
storage looks like and works and costs, as the prices continue to drop
and the production increases.
For those with the budget for it, HPE's The Machine only starts to
become relevant when technical specs and developer docs and seeds start
to become available. Until then, it's vaporware. Or whether we're
discussing AArch64 servers or RISC V configurations in a few years, for
that matter.
Agree - let's forget HPE's The Machine .. it's toast.

Martin Fink (the brains behind it and its internal promoter) is no longer even with HPE.

HPE will simply let the hype run down and eventually announce they are allocating The Machine resources to "related" future technologies that HPE is working on.

Fwiw, I suspect these internal design resources have already been reallocated to SGI-related transformation technologies. HPE bought SGI last year - likely after it had given up on The Machine.

Reference:
https://www.nextplatform.com/2016/11/29/hpe-takes-high-end-sgi-expertise/

Regards,

Kerry Main
Kerry dot main at starkgaming dot com
Stephen Hoffman
2017-01-17 23:25:32 UTC
Permalink
Raw Message
Post by Kerry Main
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 10, 2017 1:23 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
When adding enough cores, every OpenVMS application eventually
encounters Amdahl's Law. Or MPSYCH. Or both. Call back when there's
better language support here to allow easier use of all these cores,
too. For the apps that need and can use all of those cores effectively.
Pure speculation, but imho, with modern CPU core technology, there is
likely less than 10-20% (perhaps less) of applications that need more
than 12-16 cores today. Most VMware VM's have 4 or less vcpu's.
OpenVMS used to run into a pretty good performance wall around 8 cores,
though that's been creeping upward due to OpenVMS and application
changes. And it's all very much application-dependent, too.
Further up the core count, the overhead of maintaining a coherent cache
limits hardware performance. Amdahl and excessive MPSYNCH overhead,
etc., mean that adding cores yields diminishing and sometimes negative
returns in aggregate performance. More than a few folks found AlphaServer ES4x boxes were
a sweet spot in the old Alpha line, in terms of price and performance.
Adding more cores provided diminishing returns, and the hardware and
license costs went up substantially. In the ProLiant product line,
HPE has stated that ~80% of the folks use two-socket servers, and more than
a few folks are looking at one- or two-socket blades or cartridges.
Density.

https://www.supermicro.com/products/MicroBlade/
https://www.supermicro.com/products/SuperBlade/
HPE's Moonshot, etc.
Post by Kerry Main
Having stated this, and to your point, the next generation strategy is
not to continue with improving server utilization via VM sprawl issues
such as what VMware propagates, but rather with improving OpenVMS to be
able to share applications and create separate environments WITHIN
OpenVMS. I think Clair stated something like this earlier..
- $show users would only show the users logged on in a specific
"group" that this user is in.- $show system or $show memory would only
show the resources assigned to that "group".
- galaxy like technology would allow dynamic sharing of cpu/memory
resources, so if a group needed extra cpu or memory, it could be easily
allocated or deallocated (dynamically, manually or via business rules).
You could still have domain wide Sys priv's that would allow
traditional system wide access.
Containers are fundamentally about customers seeking to achieve a local
minimum of pricing, given licensing costs.

I can't see most vendors that charge for operating system licenses or
product licenses investing heavily in undercutting their own licensing
schemes and revenues, but stranger things have happened.

SGX attempts isolation here, not that there aren't discussions around
attestation and security of those constructs.

Then there's the fact that OpenVMS has fundamentally no idea how to isolate
less-than-completely-trusted applications. It's increasingly unwise
to trust even the applications you've written, or services provided by
the vendors. It's all discretionary, and even resurrecting the old
mandatory access controls won't suffice. Here's something to
ruminate on:

https://blogs.technet.microsoft.com/mmpc/2017/01/13/hardening-windows-10-with-zero-day-exploit-mitigations/


All of these — containers, sandboxes, ASLR, etc — make attacks harder,
but not impossible. Not that OpenVMS hasn't already had malware and
attacks and even the occasional virus. There'll be much more of that
should the installed base increase or the value of the targets running
OpenVMS become interesting to attackers, too.
Post by Kerry Main
Yes, this is also where LDAP/Enterprise Directory would be a critical
component of the overall next gen solution.
LDAP is already a critical component. OpenVMS is more than a decade
late to that particular party.
--
Pure Personal Opinion | HoffmanLabs LLC
Kerry Main
2017-01-18 03:06:12 UTC
Permalink
Raw Message
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 17, 2017 6:26 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
Post by Kerry Main
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 10, 2017 1:23 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
Post by Kerry Main
When adding enough cores, every OpenVMS application eventually
encounters Amdahl's Law. Or MPSYNCH. Or both. Call back when
there's better language support here to allow easier use of all these
cores, too. For the apps that need and can use all of those cores
effectively.
Post by Kerry Main
Pure speculation, but imho, with modern CPU core technology, there is
likely less than 10-20% (perhaps less) of applications that need more
than 12-16 cores today. Most VMware VM's have 4 or less vcpu's.
OpenVMS used to run into a pretty good performance wall around 8
cores, though that's been creeping upward due to OpenVMS and
application
changes. And it's all very much application-dependent, too.
Further up the core count, the overhead of maintaining a coherent cache
limits hardware performance. Amdahl and excessive MPSYNCH
overhead,
etc., means adding cores doesn't add and variously reduces aggregate
performance. More than a few folks found AlphaServer ES4x boxes were
a sweet spot in the old Alpha line, in terms of price and performance.
One big reason why ES45's (ES45B to be specific) were so popular, and why it took so long for Integrity servers to catch up to the ES45B's performance, was that the ES45B servers had 1.25 GHz EV68 CPU's, each with 16 MB of L2 cache on board.
Adding more cores provided diminishing returns, and the hardware and
license costs went up substantially. In the ProLiant product line,
HPE has stated ~80% of the folks use two socket servers, and more
than a few folks are looking at one- or two-socket blades or cartridges.
Density.
The reason for smaller core servers is only partly due to improved performance. It is also driven by draconian per-core license charges from companies like Oracle, IBM and, more recently, Microsoft. Running VMware on a server and running Oracle in 1 or 2 small VM's still requires licensing Oracle for ALL of the cores on that server.
https://www.supermicro.com/products/MicroBlade/
https://www.supermicro.com/products/SuperBlade/
HPE's Moonshot, etc.
Post by Kerry Main
Having stated this, and to your point, the next generation strategy is
not to continue with improving server utilization via VM sprawl issues
such as what VMware propagates, but rather with improving OpenVMS to
be able to share applications and create separate environments WITHIN
OpenVMS. I think Clair stated something like this earlier..
- $show users would only show the users logged on in a specific
"group" that this user is in.
- $show system or $show memory would only show the resources assigned
to that "group".
- galaxy like technology would allow dynamic sharing of cpu/memory
resources, so if a group needed extra cpu or memory, it could be
easily allocated or deallocated (dynamically, manually or via business
rules).
You could still have domain wide Sys priv's that would allow
traditional system wide access.
Containers are fundamentally customers seeking to achieve a local
minimum of pricing given licensing costs.
Even with all of the associated issues with them, containers are fundamentally being driven by commodity OS Customers struggling big time with VM sprawl issues.
Can't see most vendors charging for operating system licenses or
product licenses investing heavily in undercutting their own licensing
schemes and revenues, but stranger things have happened,
The Red Hat model of charging for support (OPEX, ongoing cash flows) and forgoing high up-front license charges (CAPEX) that require C-level approvals is the model for the future. In addition, there should be a major push to simplify the support license model with more bundled offerings. One only has to read the various web forums to hear how much Customers hate complicated and confusing licensing schemes.

Oracle, Microsoft, IBM, SAP etc are all in for some really tough times as Customers look to significantly reduce their IT SW costs with solutions they feel are "good enough".

This is the same trend that resulted in so many Customers jumping on commodity OS's like Linux.
SGX attempts isolation here, not that there aren't discussions around
attestation and security of those constructs.
Then there's that OpenVMS has fundamentally no idea how to isolate
less-than-completely-trusted applications. It's increasingly unwise
to trust even the applications you've written, or services provided by
the vendors. It's all discretionary, and even resurrecting the old
mandatory access controls won't suffice. Here's something to
https://blogs.technet.microsoft.com/mmpc/2017/01/13/hardening-windows-10-with-zero-day-exploit-mitigations/
All of these — containers, sandboxes, ASLR, etc — make attacks harder,
but not impossible. Not that OpenVMS hasn't already had malware and
attacks and even the occasional virus. There'll be much more of that
should the installed base increase or the value of the targets running
OpenVMS become interesting to attackers, too.
While popularity does have some bearing on hacker interest, it is also about the payoff for the hacker, the time they must invest in the hack, and just how much knowledge the hacker has of the target platform.

Given OpenVMS's high-value core areas of banks, stock exchanges, manufacturing, power utilities, lotteries etc., if there were as many security issues with OpenVMS as you like to promote, then why have we not heard of these exploits over the last 15+ years?

[yes, there have been a few OpenVMS security issues over the years, but nowhere near 20-30+ per month for every year going back decades]

Yes, yes, there are security issues that need to be addressed in OpenVMS and new functional areas to be added, but I just do not buy the argument that a platform gets 20-30+ security issues highlighted EACH and EVERY month simply because it is more popular. That is rubbish.

That is like saying the only reason corner stores have more robberies than banks is that there are more corner stores than banks.

Engineering culture, architecture, history and small, creative Eng teams are, imho, what makes for good platform products in the long term.
Post by Kerry Main
Yes, this is also where LDAP/Enterprise Directory would be a critical
component of the overall next gen solution.
LDAP is already a critical component. OpenVMS is a more than a decade
late to that particular party.
Agree improvements in LDAP are needed. Having stated this, LDAP is simply the first step in next-generation cross-platform security planning. On top of this, one needs to add layers of Identity Management (IdM). Thankfully, there are some 100% Java commercial solutions available that can sit on top of any V3-compliant LDAP solution.

http://www.idmworks.com/iam-integration-software/openvms-connector/

http://www.prweb.com/releases/prwebIdentityForge/OpenVMS/prweb9155858.htm


Regards,

Kerry Main
Kerry dot main at starkgaming dot com
Stephen Hoffman
2017-01-18 17:29:30 UTC
Permalink
Raw Message
Post by Kerry Main
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 17, 2017 6:26 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
OpenVMS used to run into a pretty good performance wall around 8 cores,
though that's been creeping upward due to OpenVMS and application
changes. And it's all very much application-dependent, too. Further
up the core count, the overhead of maintaining a coherent cache limits
hardware performance. Amdahl and excessive MPSYNCH
overhead, etc., means adding cores doesn't add and variously reduces
aggregate performance. More than a few folks found AlphaServer ES4x
boxes were a sweet spot in the old Alpha line, in terms of price and
performance.
One big reason why ES45's (ES45B to be specific) were so popular and
why it took so long for Integrity servers to catch up to the ES45B's
performance was that the ES45B servers had
I remember that a little differently. Integrity servers caught up
with the ES4x class fairly quickly. It was the EV7 class Alpha that
held the performance lead for quite a while, that due in no small part
to the limitations of the older FSB design used in the Itanium
processors of that earlier era; in the pre-QPI designs. The EV7 torus
was exceedingly fast, for its era. Still does pretty well, all things
considered. Though power and cooling costs and hardware and software
and support costs have led to the retirement of many of the Alpha boxes.
Post by Kerry Main
1.25Ghz EV68 CPU's and each had 16MB of L2 cache on board.
In isolation, clock speeds and cache sizes don't tell a particularly
useful story.
On EV68-class processors, there can be up to 16.0 MB of L2 cache
off-chip, while EV7 has a significantly smaller 1.75 MB of L2 cache
on-chip. In isolation, you might well assume that the far larger L2
cache on EV68 would be the better choice than the much smaller L2 of
EV7. Conversely, the bandwidth from main memory into L2 on EV7 is
significantly faster than on EV68 — if you have the bandwidth, the size
of the cache can be smaller. On-chip usually beats off-chip, due in no
small part to the improved interconnect speed from the locality — the
further from the processor, the slower the access. Cache being a
technique for contending with off-chip performance limitations, and if
you can use higher associatively and higher off-chip bandwidth and
faster main memory, you can use less cache. HP reports “The seven-way
set associative cache [of EV7] behaves like a 12-16 MB L2 cache [of
EV68].”
With EV68, the speeds and feeds are 9.6 GB/s to L1, 6.4 GB/s to L2, and
3.2 GB/s off-chip. There is one shared path out to memory and to I/O,
through the northbridge.
With EV7, 19.2 GB/s to L2 and 12.8 GB/s to local main memory through
the integrated RAMbus RDRAM memory controller; with the integrated
northbridge. Further, EV7 and its integrated northbridge has four
off-chip processor interconnects — usually labeled North, South, East
and West — with each having 6.4 GB/s speeds, and with an additional 6.4
GB/s port for I/O. Six interconnects. These interconnects are the basis
of memory, I/O and the multiprocessing torus.
EV7 had more bandwidth to main memory than EV68 had to L1 cache.
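A rough way to see why the smaller on-chip cache can win is the usual
average-memory-access-time arithmetic. The C sketch below uses invented
round-number latencies and hit rates, not EV68 or EV7 measurements; the
point is only that a smaller cache with a much faster miss path can beat
a larger cache with a slower one.

    #include <stdio.h>

    /* Illustrative only: average memory access time (AMAT) for two
     * hypothetical cache designs.  The latencies and hit rates below
     * are made-up round numbers for the example, not measured EV68 or
     * EV7 figures. */
    static double amat(double hit_ns, double hit_rate, double miss_ns)
    {
        return hit_rate * hit_ns + (1.0 - hit_rate) * miss_ns;
    }

    int main(void)
    {
        /* Big off-chip cache: high hit rate, slow access, slow miss path. */
        double big_slow   = amat(15.0, 0.97, 150.0);
        /* Small on-chip cache: lower hit rate, fast access, fast miss path. */
        double small_fast = amat(5.0, 0.92, 75.0);

        printf("big, slow cache  : %.1f ns average access\n", big_slow);
        printf("small, fast cache: %.1f ns average access\n", small_fast);
        return 0;
    }

With those assumed numbers the big, slow cache averages about 19 ns per
access and the small, fast one about 11 ns, despite the lower hit rate.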

Cache gets added when the memory latency is larger and/or bandwidth is
lower. When the designer wants to avoid taking the slower path.
More cache is an indication of a mismatched design; where some
component — HDD, for instance — is much slower than some other
component — memory, I/O bus, main memory, processor — in the system
design. Except to the folks in marketing, core clock speed only
provides a comparison within a design. Not across processor designs.
And clock speeds get really interesting as core counts increase, but
thermals limit how many can be active and for how long, and whether
fewer cores can potentially be deliberately overclocked.

In the case of the ES4x classes, the applications ran faster on those
boxes up to four cores than the same applications ran on larger boxes,
or the incremental increase in performance was offset by the hardware
and software license costs, or was lost to the lower efficiency of the
added processors.

Even in current-generation Intel x86-64 cores, more cores means lower
speeds, and throttling down cores allows a subset of the processor
cores to remain running longer or to sprint until thermals force
throttling. Adding cores inevitably makes the whole configuration
slower. Which means that lower-core count processors are faster for
fewer-stream application loads and the higher-core-count processors are
better at what used to be called batch-oriented loads.

The same trade-offs have held for OpenVMS all the way back to VAX and
ASMP. Some configurations are better at some tasks than others, and
there's always some point at which adding cores doesn't speed up
performance and often decreases it, both due to the parallelism in the
software and due to the added overhead of maintaining cache coherency
across the cores. VAX ASMP used the Qualify tool to determine whether
an application load would benefit from adding that second core. SMP
rounded off a number of the corners found in the ASMP design, but it's
all still whether the core count or some other factor is the
bottleneck. Adding more doesn't make things faster.
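For a rough feel of why adding cores eventually hurts, here is a small
C sketch of Amdahl's Law with a crude per-core synchronization penalty
bolted on. The parallel fraction and overhead figures are invented for
illustration, not measured from any OpenVMS workload; with these
numbers the curve peaks and then falls as cores are added.

    #include <stdio.h>

    /* Crude model: Amdahl's Law plus a per-additional-core penalty
     * standing in for MPSYNCH / cache-coherency traffic.  p is the
     * parallel fraction of the workload.  Numbers are illustrative. */
    static double speedup(int cores, double p, double overhead)
    {
        double serial = 1.0 - p;
        return 1.0 / (serial + p / cores + overhead * (cores - 1));
    }

    int main(void)
    {
        const double p = 0.90;          /* 90% of the work parallelizes   */
        const double overhead = 0.005;  /* 0.5% added cost per extra core */
        int n;

        for (n = 1; n <= 32; n *= 2)
            printf("%2d cores: %.2fx\n", n, speedup(n, p, overhead));
        return 0;
    }

Under those assumptions the speedup tops out around 16 cores and is
lower again at 32, which is the diminishing-and-then-negative return
described above.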
Post by Kerry Main
Adding more cores provided diminishing returns, and the hardware and
license costs went up substantially. In the ProLiant product line,
HPE has stated ~80% of the folks use two socket servers, and more
than a few folks are looking at one- or two-socket blades or cartridges.
Density.
The reason for smaller core servers is only partly due to
improved performance.
Isn't that what I just wrote?
Post by Kerry Main
It is also draconian per core license charges from companies like
Oracle, IBM and more recently Microsoft. Run VMware on a server and run
Oracle on 1 or 2 small VM's still requires licensing Oracle for ALL of
the cores on that server.
Whether per-core or per-socket or per-box, it's the aggregate costs.
Post by Kerry Main
Containers are fundamentally customers seeking to achieve a local
minimum of pricing given licensing costs.
Even with all of the associated issues with them, containers are
fundamentally being driven by commodity OS Customers struggling big
time with VM sprawl issues.
Can't say I see much of a difference between tracking where and how a
container is running and tracking where and how an image is running in
a VM, in terms of the struggling involved here. Containers are
operating within a guest within a virtual machine, and — to keep them
isolated — containers have their own IP addresses and other constructs
intended to keep them both isolated and to reduce resource conflicts,
and VM guests themselves have their own IP addresses, including one
for management. So there's seemingly somewhat more tracking involved
here. Turtles all the way down.

And your faith in the trustworthiness of each individual application is
commendable.
Post by Kerry Main
Can't see most vendors charging for operating system licenses or
product licenses investing heavily in undercutting their own licensing
schemes and revenues, but stranger things have happened,
Red Hat model of charging for support (OPEX - ongoing cash flows) and
forgoing high up front license charges (CAPEX) that require C level
approvals is the model for the future.
I see that a little differently. RHEL acquires developers and
early-stage and prototype deployments with completely free offerings —
affordable entry-level configurations — and then captures the folks
that subsequently want or need somebody to blame; that want support.

DEC tried various pricing models, as have Compaq and HP and now HPE,
even with OpenVMS. VSI has made some comments in this area, too.
Some of what's been tried includes capacity on demand, per-user
licenses, server-only licenses, and other programs. But there
seemingly hasn't been a competitive entry-level offering for OpenVMS in
quite a while; arguably not since the VAX era. Particularly not once
the market was moving from VAX to Unix boxes and then from whatever to
Windows boxes, and more recently to Linux servers.

Current prices for Alpha and for Itanium are not particularly conducive
to wholly new projects with wholly new deployments.
Post by Kerry Main
In addition, there should be a major push to simplify the support
license model with more bundled offerings. One only has to read the
various web forums to hear how much Customers hate complicated and
confusing licensing schemes.
Complicated product offerings and complicated product packages and
complicated user interfaces and complicated deployments and complicated
morasses of containers and VM guests and blades and the rest aren't
popular, either. There's real skill and real discipline in making
products that are or that appear less complex, and that are easier to
use, and easier to license, and products that are easier to support and
maintain, too. Implementing piles of logical names, and twisty little
configuration files — some different, some alike, some parameter
databases, some text files — just isn't on that path, either.
Post by Kerry Main
Oracle, Microsoft, IBM, SAP etc are all in for some really tough times
as Customers look to significantly reduce their IT SW costs with
solutions they feel are "good enough".
This is the same trend that resulted in so many Customers jumping on
commodity OS's like Linux.
Welcome to commoditization.
Post by Kerry Main
All of these — containers, sandboxes, ASLR, etc — make attacks harder,
but not impossible. Not that OpenVMS hasn't already had malware and
attacks and even the occasional virus. There'll be much more of that
should the installed base increase or the value of the targets running
OpenVMS become interesting to attackers, too.
While popularity does have some bearing on hacker interest, it is also
about what is the payoff for the hacker + the time they invest in the
hack + just how much knowledge the hacker has in the target platform.
There's more money to be made elsewhere for now and it's often easier
to phish somebody than to mount other attacks, or it's just better to
keep any available information on the vulnerabilities quiet. Why go
public with a vulnerability and allow defenders to fix it, if you
might want to use that vulnerability later? Or to use the
vulnerability again, for that matter.

Making the attackers' effort larger — sandboxes, ASLR,
no-execute, network encryption, distributed authentication, etc. — all
of these serve to increase the costs and the difficulties for the
attackers, though added security also increases costs on the vendor to
develop those defenses and to maintain it all, and adds costs for the
ISVs and customers to configure and maintain and troubleshoot it all.
To use the defenses, for that matter, as traditional mandatory access
controls — or the security provided by sandboxes, for that matter —
can be a hassle to use or to comply with.
Post by Kerry Main
Given OpenVMS's high value core areas of banks, stock exchanges,
manufacturing, power utilities, lotteries etc. if there was so many
security issues with OpenVMS as you like to promote, then why we have
not heard of these exploits over the last 15+ years?
[yes, there have been a few OpenVMS security issues over the years, but
no where near 20-30+ per month for every year going back decades]
When the disclosure practices differ, any attempt at comparing CVE
counts is codswallop. At best. Much like comparing cache sizes and
processor clocks, but I digress. It only takes one exposure — quite
possibly not in OpenVMS itself, but in some printer that then allows
access into SCS, which allows access to the password hash, which is
far too fast to calculate, leading to an exposure, and down we go...
In that case, a printer vulnerability got remote entrance, and two
known vulnerabilities within OpenVMS — no CVEs exist for either the
far-too-fast Purdy polynomial password hash or for the unencrypted
cluster communications transport, BTW — then allowed access into
OpenVMS itself.
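To see why the speed of the password hash matters, a bit of
illustrative arithmetic: assume an attacker who can test a billion
candidates per second against a stolen hash (an assumed figure, not a
measured rate for Purdy), and an eight-character password drawn from
uppercase letters and digits.

    #include <stdio.h>
    #include <math.h>   /* link with -lm */

    /* Illustrative arithmetic only: why a fast password hash is a
     * liability.  The guess rate is an assumed figure for commodity
     * hardware, not a measured rate for the Purdy hash. */
    int main(void)
    {
        const double guesses_per_second = 1.0e9;   /* assumed attacker rate */
        const double keyspace = pow(36.0, 8.0);    /* 8 chars, A-Z plus 0-9 */
        double seconds = keyspace / guesses_per_second;

        printf("keyspace: %.3g candidates\n", keyspace);
        printf("exhaustive search: about %.0f seconds (~%.0f minutes)\n",
               seconds, seconds / 60.0);
        return 0;
    }

That works out to roughly three quarters of an hour for the whole
keyspace. A deliberately slow, salted, iterated hash pushes the same
search out by orders of magnitude.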

As for CVE counts? Where are the OpenVMS CVEs for Apache? For ISC
BIND? For the NTP server? For OpenSSL? For the LPs that were
updated when OpenSSL was updated? For SMH? Some have been issued,
but these patches often haven't been made available or haven't been
quickly available, and/or CVEs often haven't been requested and
assigned or referenced, etc. For problems that do apply to OpenVMS.

As for breaches? Many institutions are loath to discuss breaches.
What VSI and HPE might have encountered is not public. Over the last
decade, I'm aware of various breaches involving OpenVMS, though whether
HPE or VSI ever heard about those or about any others, I don't know.

I'd like to have no vulnerabilities. But we're not in an era where
there are no security holes. In OpenVMS, or otherwise. We are in
an era when we have to patch much more quickly. Which means... What
can we learn from how other platforms roll out their patches, and what
mechanisms and tools are available to make that easier? Because in
this case, a rate of 1 per year is entirely equivalent to a rate of
20-30+ per month, in terms of the damage that can be caused by a
vulnerability. And it's just as often the ones you don't know or
didn't go looking for, too. That's entirely before any contemplations
around differences in disclosure practices or CVE request practices,
and whether any of it can be correlated. Which means... how can I
reduce or isolate and identify risks, and how can I more quickly
mitigate breaches? I'd like to trust that OpenVMS doesn't have some
appreciable subset of that twenty to thirty holes a month, but I spend
more than a little time reading disclosure reports looking for cases
that might (or do, as I've variously found) apply to OpenVMS, too.
And I do expect there to be cases where I have to roll out OpenVMS
patches yesterday, if not sooner.
Post by Kerry Main
Yes, Yes, there are security issues that need to be addressed in
OpenVMS and new functional areas added, but I just do not buy the
argument that a platform gets 20-30+ security issues highlighted EACH
and EVERY month simply because it is more popular. That is rubbish.
Folks can point to Linux or Windows security problems here, and the
lesson we should all be learning includes the need to be notified of
and to push out patches much more quickly. To look for and identify
problems, for that matter. Aiming Kali at an OpenVMS box can be
entertaining, too. You'll almost certainly find some issues, in most
any non-trivial configuration.

Going years or months or weeks, or increasingly days or hours, from disclosure to patch is problematic.

Apache TLS was down-revision for years.

Has the vendor-provided SMTP mail server offered encrypted client
connections yet? (I'll leave off START_TLS, as that's fodder for
another debate or two.)
Post by Kerry Main
Engineering culture, architecture, history and small, creative Eng
teams are imho, what makes for good platform products in the long term.
More than a few engineering companies went out of business with that
model, too. Doing what the customer wants at a price they can
afford. VSI is working toward that and toward sustainable revenues,
which means that there will be compromises across all areas,
including some involving security.
Post by Kerry Main
Post by Kerry Main
Yes, this is also where LDAP/Enterprise Directory would be a critical
component of the overall next gen solution.
LDAP is already a critical component. OpenVMS is a more than a decade
late to that particular party.
Agree improvements in LDAP are needed.
This is a decade or two of remediation, too; what will be a massive and
long-running project. This means dragging all the isolated
information stores forward in OpenVMS and layered products and
third-party packages and local software; of new APIs and particularly
of learning to play well with other servers in modern network
environments via Kerberos and LDAP, and of integrating TLS and DTLS as
well. This is far beyond the existing and fussy and hard-to-configure
and hard-to-troubleshoot LDAP external authentication password
integration presently available. Probably part of replacing the
existing SYSUAF design, which is itself long overdue for replacement.
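For reference, the core of LDAP-based credential checking is small; the
sketch below uses the OpenLDAP C API with placeholder URI and DN
values, and a real integration would use TLS and map the result into
the local authentication policy. It illustrates the building block,
not a design for the SYSUAF replacement.

    #include <stdio.h>
    #include <string.h>
    #include <ldap.h>   /* OpenLDAP client API; link with -lldap */

    /* Minimal sketch: verify a credential with a simple LDAP bind,
     * roughly what an external-authentication hook ends up doing.
     * URI and DN are placeholders; real code would use ldaps:// or
     * StartTLS and proper error reporting. */
    int check_credentials(const char *uri, const char *dn, const char *password)
    {
        LDAP *ld = NULL;
        int version = LDAP_VERSION3;
        struct berval cred;
        int rc;

        if (ldap_initialize(&ld, uri) != LDAP_SUCCESS)
            return -1;

        ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION, &version);

        cred.bv_val = (char *) password;
        cred.bv_len = strlen(password);

        /* A successful simple bind means the directory accepted the password. */
        rc = ldap_sasl_bind_s(ld, dn, LDAP_SASL_SIMPLE, &cred,
                              NULL, NULL, NULL);

        ldap_unbind_ext_s(ld, NULL, NULL);
        return (rc == LDAP_SUCCESS) ? 0 : -1;
    }

    int main(void)
    {
        /* Hypothetical directory and user; adjust for a real deployment. */
        if (check_credentials("ldap://dc.example.com",
                              "uid=jdoe,ou=people,dc=example,dc=com",
                              "not-a-real-password") == 0)
            printf("bind succeeded\n");
        else
            printf("bind failed\n");
        return 0;
    }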
--
Pure Personal Opinion | HoffmanLabs LLC
r***@gmail.com
2017-01-12 20:26:17 UTC
Permalink
Raw Message
Just to respond to the original comment about hardware offload capabilities:

The ones listed in the plans are what are currently supported by the NICs we are doing development on.

LAN VCI 2.0 is designed to allow future offload, encryption, whatever else to be added. We do plan to address these features, but keep in mind we are trying to get to x86 as soon as possible. We haven't mentioned the possible new devices to be supported, and we aren't worried about them yet. The focus is on getting the infrastructure in place and working with a few existing drivers that we know well. Then it is on to new devices and more functionality that we find available on these devices.
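For readers unfamiliar with what Send/Recv CKO actually saves, the
sketch below is the textbook RFC 1071 one's-complement checksum the
host CPU would otherwise run over every outgoing and incoming TCP/UDP
segment; it is an illustration only, not VSI driver code. TSO and LRO
save analogous per-segment work by letting the NIC split large sends
and coalesce received segments.

    #include <stddef.h>
    #include <stdint.h>

    /* RFC 1071 Internet checksum: the per-packet work that Send and
     * Receive Checksum Offload move from the host CPU onto the NIC. */
    uint16_t internet_checksum(const void *data, size_t len)
    {
        const uint16_t *p = data;
        uint32_t sum = 0;

        while (len > 1) {           /* sum 16-bit words                 */
            sum += *p++;
            len -= 2;
        }
        if (len == 1)               /* fold in a trailing odd byte      */
            sum += *(const uint8_t *) p;

        while (sum >> 16)           /* fold carries back into 16 bits   */
            sum = (sum & 0xffff) + (sum >> 16);

        return (uint16_t) ~sum;     /* one's-complement of the sum      */
    }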
Dirk Munk
2017-01-13 13:24:13 UTC
Permalink
Raw Message
Post by r***@gmail.com
The ones listed in the plans are what are currently supported by the NICs we are doing development on.
LAN VCI 2.0 is designed to allow future offload, encryption, whatever else to be added. We do plan to address these features, but keep in mind we are trying to get to x86 as soon as possible. We haven't mentioned the possible new devices to be supported, and we aren't worried about them yet. The focus is on getting the infrastructure in place and working with a few existing drivers that we know well. Then it is on to new devices and more functionality that we find available on these devices.
Thanks for your reply.

It's good to know that more offload engines than the ones in the present
roadmap are taken into account in the architecture of the drivers.

Of course I understand that the x86 project is the most important thing;
any development for it should at least be done in such a way that
capabilities like additional offload engines can easily be added later
on, and you have covered that.

With regard to PTP, also known as IEEE 1588, I'm sure you realise that
the present adapters on your roadmap don't support it. VMS always was
big in the financial world, and if I'm correct, very accurate time is
very important there.

It seems there is an IEEE 1588v2 that is even more accurate.

Will supporting IEEE 1588 also influence the running of the system clock?

I haven't seen a 10Gb/s NIC with IEEE 1588 support so far, just 1Gb/s NIC's.
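For what it's worth, the arithmetic PTP performs on the exchanged
timestamps is simple; what the special NIC circuitry buys is accurate
inputs, since timestamps taken in software pick up interrupt and
scheduling jitter. A sketch with invented nanosecond values:

    #include <stdio.h>
    #include <stdint.h>

    /* Textbook IEEE 1588 delay request-response arithmetic, in ns.
     * t1: master sends Sync, t2: slave receives it,
     * t3: slave sends Delay_Req, t4: master receives it.
     * Assumes a symmetric path; the example timestamps are made up. */
    int main(void)
    {
        int64_t t1 = 1000000000LL;
        int64_t t2 = 1000150000LL;
        int64_t t3 = 1000300000LL;
        int64_t t4 = 1000350000LL;

        int64_t offset = ((t2 - t1) - (t4 - t3)) / 2;  /* slave clock error  */
        int64_t delay  = ((t2 - t1) + (t4 - t3)) / 2;  /* one-way path delay */

        printf("offset from master: %lld ns\n", (long long) offset);
        printf("mean path delay   : %lld ns\n", (long long) delay);
        return 0;
    }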