Post by Kerry Main
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 17, 2017 6:26 PM
Subject: Re: [Info-vax] Are the LAN VCI 2.0 plans ambitious enough?
OpenVMS used to run into a pretty good performance wall around 8 cores,
though that's been creeping upward due to OpenVMS and application
changes. And it's all very much application-dependent, too. Further
up the core count, the overhead of maintaining a coherent cache limits
hardware performance. Amdahl's law and excessive MPSYNCH overhead,
etc., mean that adding cores stops adding aggregate performance and
variously reduces it. More than a few folks found AlphaServer ES4x
boxes were a sweet spot in the old Alpha line, in terms of price and
performance.
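For the arithmetic behind that wall, a minimal C sketch of Amdahl's
law is below; the 95% parallel fraction is an invented figure for
illustration, not a measured OpenVMS workload.

    /* Amdahl's law: speedup = 1 / ((1 - p) + p / n).  The parallel
       fraction p is an assumption for illustration only. */
    #include <stdio.h>

    int main(void)
    {
        const double p = 0.95;  /* assumed parallelizable fraction */

        for (int n = 1; n <= 64; n *= 2)
            printf("%2d cores: %5.2fx speedup (asymptote %.0fx)\n",
                   n, 1.0 / ((1.0 - p) + p / n), 1.0 / (1.0 - p));
        return 0;
    }

With those assumptions, 8 cores yield under a 6x speedup, and no core
count ever reaches 20x.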
Post by Kerry Main
One big reason why ES45's (ES45B to be specific) were so popular and
why it took so long for Integrity servers to catch up to the ES45B's
performance was that the ES45B servers had
I remember that a little differently. Integrity servers caught up
with the ES4x class fairly quickly. It was the EV7 class Alpha that
held the performance lead for quite a while, that due in no small part
to the limitations of the older FSB design used in the Itanium
processors of that earlier era; in the pre-QPI designs. The EV7 torus
was exceedingly fast, for its era. Still does pretty well, all things
considered. Though power and cooling costs and hardware and software
and support costs have led to the retirement of many of the Alpha boxes.
Post by Kerry Main
1.25 GHz EV68 CPUs and each had 16 MB of L2 cache on board.
In isolation, clock speeds and cache sizes don't tell a particularly
useful story.
Post by Kerry Main
On EV68-class processors, there can be up to 16.0 MB of L2 cache
off-chip, while EV7 has a significantly smaller 1.75 MB of L2 cache
on-chip. In isolation, you might well assume that the far larger L2
cache on EV68 would be the better choice than the much smaller L2 of
EV7. Conversely, the bandwidth from main memory into L2 on EV7 is
significantly faster than on EV68 — if you have the bandwidth, the size
of the cache can be smaller. On-chip usually beats off-chip, due in no
small part to the improved interconnect speed from the locality — the
further from the processor, the slower the access. Cache being a
technique for contending with off-chip performance limitations, and if
you can use higher associativity and higher off-chip bandwidth and
faster main memory, you can use less cache. HP reports “The seven-way
set associative cache [of EV7] behaves like a 12-16 MB L2 cache [of
EV68].”
With EV68, the speeds and feeds are 9.6 GB/s to L1, 6.4 GB/s to L2, and
3.2 GB/s off-chip. There is one shared path out to memory and to I/O,
through the northbridge.
With EV7, 19.2 GB/s to L2 and 12.8 GB/s to local main memory through
the integrated Rambus RDRAM memory controller and the integrated
northbridge. Further, EV7 and its integrated northbridge have four
off-chip processor interconnects — usually labeled North, South, East
and West — with each having 6.4 GB/s speeds, and with an additional 6.4
GB/s port for I/O. Six interconnects. These interconnects are the basis
of memory, I/O and the multiprocessing torus.
EV7 had more bandwidth to main memory than EV68 had to L1 cache.
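As a back-of-envelope check of those speeds and feeds, here's a short
C sketch using only the GB/s figures quoted above; it computes how
long streaming an assumed 16 MB working set takes over each path.

    /* Streaming time for a 16 MB working set at the quoted bandwidths.
       The working-set size is an arbitrary illustration. */
    #include <stdio.h>

    int main(void)
    {
        const double ws_gb = 16.0 / 1024.0;  /* 16 MB, expressed in GB */
        const struct { const char *path; double gbps; } paths[] = {
            { "EV68 L1 (on-chip)",    9.6 },
            { "EV68 L2 (off-chip)",   6.4 },
            { "EV68 memory and I/O",  3.2 },
            { "EV7 L2 (on-chip)",    19.2 },
            { "EV7 local memory",    12.8 },
        };

        for (int i = 0; i < (int)(sizeof paths / sizeof paths[0]); i++)
            printf("%-20s %5.1f GB/s -> %5.0f microseconds\n",
                   paths[i].path, paths[i].gbps,
                   ws_gb / paths[i].gbps * 1e6);
        return 0;
    }

EV7's local memory path finishes the sweep faster than EV68's L1
does, which is the point.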
Cache gets added when the memory latency is larger and/or bandwidth is
lower. When the designer wants to avoid taking the slower path.
More cache is an indication of a mismatched design; where some
component — HDD, for instance — is much slower than some other
component — memory, I/O bus, main memory, processor — in the system
design. Except to the folks in marketing, core clock speed only
provides a comparison within a design. Not across processor designs.
And clock speeds get really interesting as core counts increase, but
thermals limit how many can be active and for how long, and whether
fewer cores can potentially be deliberately overclocked.
In the case of the ES4x classes, applications ran faster on those
boxes at up to four cores than the same applications ran on larger
boxes, or the incremental increase in performance was offset by the
hardware and software license costs, or was eroded by the lower
efficiency of the added processors.
Even in current-generation Intel x86-64 cores, more cores means lower
speeds, and throttling down cores allows a subset of the processor
cores to remain running longer or to sprint until thermals force
throttling. Adding cores inevitably makes the whole configuration
slower. Which means that lower-core-count processors are faster for
fewer-stream application loads and the higher-core-count processors are
better at what used to be called batch-oriented loads.
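A toy model shows the shape of it: a fixed package power budget split
across the active cores. The wattage figures and the turbo bin below
are invented numbers, not any particular Intel SKU.

    /* Fixed package power split across active cores; all constants
       are assumptions for illustration. */
    #include <stdio.h>

    int main(void)
    {
        const double package_watts = 95.0; /* assumed thermal budget */
        const double watts_per_ghz = 7.0;  /* assumed per-core cost  */
        const double max_turbo_ghz = 4.5;  /* assumed top turbo bin  */

        for (int active = 1; active <= 16; active *= 2) {
            double ghz = package_watts / (active * watts_per_ghz);
            if (ghz > max_turbo_ghz)
                ghz = max_turbo_ghz;
            printf("%2d active cores: ~%.1f GHz sustained\n",
                   active, ghz);
        }
        return 0;
    }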
The same trade-offs have held for OpenVMS all the way back to VAX and
ASMP. Some configurations are better at some tasks than others, and
there's always some point at which adding cores doesn't speed up
performance and often decreases it, both due to the parallelism in the
software and due to the added overhead of maintaining cache coherency
across the cores. VAX ASMP used the Qualify tool to determine whether
an application load would benefit from adding that second core. SMP
rounded off a number of the corners found in the ASMP design, but it's
all still whether the core count or some other factor is the
bottleneck. Adding more doesn't make things faster.
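One way to model that rollover is Gunther's Universal Scalability
Law, which adds a coherency term to Amdahl-style contention. The
coefficients below are invented for illustration; with these values
the curve peaks near eight to ten cores and then declines.

    /* USL: throughput(n) = n / (1 + sigma*(n-1) + kappa*n*(n-1)).
       Both coefficients are assumptions for illustration. */
    #include <stdio.h>

    int main(void)
    {
        const double sigma = 0.05; /* assumed contention (serialization) */
        const double kappa = 0.01; /* assumed coherency (crosstalk) cost */

        for (int n = 1; n <= 32; n *= 2)
            printf("%2d cores: relative throughput %5.2f\n",
                   n, n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1)));
        return 0;
    }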
Post by Kerry Main
Adding more cores provided diminishing returns, and the hardware and
license costs went up substantially. In the ProLiant product line,
HPE has stated ~80% of the folks use two socket servers, and more
than a few folks are looking at one- or two-socket blades or cartridges.
Density.
Post by Kerry Main
Part of the reason for smaller core servers is only partly due to
improved performance.
Isn't that what I just wrote?
Post by Kerry Main
It is also the draconian per-core license charges from companies like
Oracle, IBM and more recently Microsoft. Run VMware on a server and
run Oracle on one or two small VMs, and you still must license Oracle
for ALL of the cores on that server.
Whether per-core or per-socket or per-box, it's the aggregate costs.
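The multiplier is easy to sketch. The host shape and the per-core
price below are placeholders; only the ratio matters.

    /* Licensing the whole host versus licensing what the VM uses.
       All figures are placeholders for illustration. */
    #include <stdio.h>

    int main(void)
    {
        const int    host_cores = 2 * 18;  /* assumed 2-socket, 18-core host  */
        const int    vm_cores   = 4;       /* what the small VM actually uses */
        const double per_core   = 10000.0; /* placeholder per-core list price */

        printf("license the VM's cores : $%9.0f\n", vm_cores * per_core);
        printf("license the whole host : $%9.0f\n", host_cores * per_core);
        printf("cost multiplier        : %.0fx\n",
               (double)host_cores / vm_cores);
        return 0;
    }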
Post by Kerry Main
Containers are fundamentally customers seeking to achieve a local
minimum of pricing given licensing costs.
Even with all of the associated issues with them, containers are
fundamentally being driven by commodity-OS customers struggling big
time with VM sprawl issues.
Can't say I see much of a difference between tracking where and how a
container is running and tracking where and how an image is running in
a VM, in terms of the struggling involved here. Containers are
operating within a guest within a virtual machine, and — to keep them
isolated — containers have their own IP addresses and other constructs
intended both to keep them isolated and to reduce resource conflicts,
and VM guests themselves have IP addresses of their own, including one
for management. So there's seemingly somewhat more tracking involved
here. Turtles all the way down.
And your faith in the trustworthiness of each individual application is
commendable.
Post by Kerry Main
Can't see most vendors charging for operating system licenses or
product licenses investing heavily in undercutting their own licensing
schemes and revenues, but stranger things have happened,
The Red Hat model of charging for support (OPEX, ongoing cash flows)
and forgoing high up-front license charges (CAPEX, which requires
C-level approvals) is the model for the future.
I see that a little differently. RHEL acquires developers and
early-stage and prototype deployments with completely free offerings —
affordable entry-level configurations — and then captures the folks
that subsequently want or need somebody to blame; that want support.
DEC tried various pricing models, as have Compaq and HP and now HPE,
even with OpenVMS. VSI has made some comments in this area, too.
Some of what's been tried includes capacity on demand, per-user
licenses, server-only licenses, and other programs. But there
seemingly hasn't been a competitive entry-level offering for OpenVMS in
quite a while; arguably not since the VAX era. Particularly not once
the market was moving from VAX to Unix boxes and then from whatever to
Windows boxes, and more recently to Linux servers.
Current prices for Alpha and for Itanium are not particularly conducive
to wholly new projects with wholly new deployments.
Post by Kerry Main
In addition, there should be a major push to simplify the support
license model with more bundled offerings. One only has to read the
various web forums to hear how much Customers hate complicated and
confusing licensing schemes.
Complicated product offerings and complicated product packages and
complicated user interfaces and complicated deployments and complicated
morasses of containers and VM guests and blades and the rest aren't
popular, either. There's real skill and real discipline in making
products that are or that appear less complex, and that are easier to
use, and easier to license, and products that are easier to support and
maintain, too. Implementing piles of logical names, and twisty little
configuration files — some different, some alike, some parameter
databases, some text files — just isn't on that path, either.
Post by Kerry Main
Oracle, Microsoft, IBM, SAP etc. are all in for some really tough times
as Customers look to significantly reduce their IT SW costs with
solutions they feel are "good enough".
This is the same trend that resulted in so many Customers jumping on
commodity OS's like Linux.
Welcome to commoditization.
Post by Kerry Main
All of these — containers, sandboxes, ASLR, etc — make attacks harder,
but not impossible. Not that OpenVMS hasn't already had malware and
attacks and even the occasional virus. There'll be much more of that
should the installed base increase or the value of the targets running
OpenVMS become interesting to attackers, too.
While popularity does have some bearing on hacker interest, it is also
about the payoff for the hacker, plus the time they invest in the hack,
plus just how much knowledge the hacker has of the target platform.
There's more money to be made elsewhere for now and it's often easier
to phish somebody than other attacks, or it's just better to keep any
available information on the vulnerabilities quiet. Why go public
with a vulnerability and allow defenders to fix it, if you might want
to use that vulnerability later? Or want to use the vulnerability
again, for that matter.
Making the effort larger for the attackers — sandboxes, ASLR,
no-execute, network encryption, distributed authentication, etc — all
serve to increase the costs and the difficulties for the attackers,
though added security also increases costs for the vendor to develop
those defenses and to maintain it all, and adds costs for the ISVs and
customers to configure and maintain and troubleshoot it all. To use
the defenses, for that matter, as traditional mandatory access controls
— or the security provided by sandboxes, for that matter — can be a
hassle to use or to comply with.
Post by Kerry Main
Given OpenVMS's high-value core areas of banks, stock exchanges,
manufacturing, power utilities, lotteries, etc., if there were as many
security issues with OpenVMS as you like to promote, then why have we
not heard of these exploits over the last 15+ years?
[yes, there have been a few OpenVMS security issues over the years, but
nowhere near 20-30+ per month for every year going back decades]
When the disclosure practices differ, any attempt at comparing CVE
counts is codswallop. At best. Much like comparing cache sizes and
processor clocks, but I digress. It only takes one exposure — quite
possibly not in OpenVMS itself, but in some printer that then allows
access into SCS, which allows access to the password hash, which is far
too fast to calculate, and down we go... In
that case, a printer vulnerability got remote entrance, and two known
vulnerabilities within OpenVMS — no CVEs exist for either the
far-too-fast Purdy polynomial password hash or for the unencrypted
cluster communications transport, BTW — then allowed access into
OpenVMS itself.
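Rough arithmetic shows why a too-fast hash is itself the exposure.
The guess rates and the eight-character keyspace below are assumptions
for illustration, not measurements of Purdy or of any particular
attacker.

    /* Keyspace sweep time at assumed guess rates; all constants are
       assumptions for illustration.  Build with -lm. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        const double keyspace  = pow(36.0, 8.0); /* assumed: 8 chars, A-Z 0-9 */
        const double fast_rate = 1e9; /* assumed guesses/s, fast hash */
        const double slow_rate = 1e4; /* assumed guesses/s, tunably slow KDF */

        printf("fast hash: %6.1f hours to sweep the keyspace\n",
               keyspace / fast_rate / 3600.0);
        printf("slow KDF : %6.1f years to sweep the keyspace\n",
               keyspace / slow_rate / (3600.0 * 24.0 * 365.0));
        return 0;
    }

At a billion guesses a second, the whole space falls in under an
hour; at ten thousand a second, it takes the better part of a decade.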
As for CVE counts? Where are the OpenVMS CVEs for Apache? For ISC
BIND? For the NTP server? For OpenSSL? For the LPs that were
updated when OpenSSL was updated? For SMH? Some have been issued,
but these patches often haven't been made available or haven't been
quickly available, and/or CVEs often haven't been requested and
assigned or referenced, etc. For problems that do apply to OpenVMS.
As for breaches? Many institutions are loath to discuss breaches.
What VSI and HPE might have encountered is not public. Over the last
decade, I'm aware of various breaches involving OpenVMS, though whether
HPE or VSI ever heard about those or about any others, I don't know.
I'd like to have no vulnerabilities. But we're not in an era where
there are no security holes. In OpenVMS, or otherwise. We are in
an era when we have to patch much more quickly. Which means... What
can we learn from how other platforms roll out their patches, and what
mechanisms and tools are available to make that easier? Because in
this case, a rate of 1 per year is entirely equivalent to a rate of
20-30+ per month, in terms of the damage that can be caused by a
vulnerability. And it's just as often the ones you don't know or
didn't go looking for, too. That's entirely before any contemplations
around differences in disclosure practices or CVE request practices,
and whether any of it can be correlated. Which means... how can I
reduce or isolate and identify risks, and how can I more quickly
mitigate breaches? I'd like to trust that OpenVMS doesn't have some
appreciable subset of those twenty to thirty holes a month, but I spend
more than a little time reading disclosure reports looking for cases
that might (or do, as I've variously found) apply to OpenVMS, too.
And I do expect there to be cases where I have to roll out OpenVMS
patches yesterday, if not sooner.
Post by Kerry Main
Yes, yes, there are security issues that need to be addressed in
OpenVMS and new functional areas added, but I just do not buy the
argument that a platform gets 20-30+ security issues highlighted EACH
and EVERY month simply because it is more popular. That is rubbish.
Folks can point to Linux or Windows security problems here, and the
lesson we should all be learning includes the need to be notified of
and to push out patches much more quickly. To look for and identify
problems, for that matter. Aiming Kali at an OpenVMS box can be
entertaining, too. You'll almost certainly find some issues, in most
any non-trivial configuration.
Going unpatched for years or months or weeks, or increasingly for days
or hours, is problematic.
Apache TLS was down-revision for years.
Has the vendor-provided SMTP mail server offered encrypted client
connections yet? (I'll leave off START_TLS, as that's fodder for
another debate or two.)
Post by Kerry Main
Engineering culture, architecture, history and small, creative Eng
teams are imho, what makes for good platform products in the long term.
More than a few engineering companies went out of business with that
model, too. What matters is doing what the customer wants at a price
they can afford. VSI is working toward that and toward sustainable
revenues, which means that there will be compromises across all areas,
including some involving security.
Post by Kerry Main
Post by Kerry Main
Yes, this is also where LDAP/Enterprise Directory would be a critical
component of the overall next gen solution.
LDAP is already a critical component. OpenVMS is more than a decade
late to that particular party.
Post by Kerry Main
Agree improvements in LDAP are needed.
This is a decade or two of remediation, too; what will be a massive and
long-running project. This means dragging forward all the isolated
information stores in OpenVMS and layered products and third-party
packages and local software; it means new APIs, and particularly
learning to play well with other servers in modern network
environments via Kerberos and LDAP, and integrating TLS and DTLS as
well. This is far beyond the existing and fussy and hard-to-configure
and hard-to-troubleshoot LDAP external authentication password
integration presently available. Probably part of replacing the
existing SYSUAF design, which is itself long overdue for replacement.
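For reference, here's a minimal sketch of an LDAP simple bind from C
using the OpenLDAP client API; the URI, bind DN and password are
placeholders, and a real integration would want TLS and SASL/Kerberos
rather than a simple bind.

    /* Minimal LDAP simple bind via the OpenLDAP client API.
       Build with -lldap.  URI, DN and password are placeholders. */
    #include <stdio.h>
    #include <ldap.h>

    int main(void)
    {
        LDAP *ld;
        int version = LDAP_VERSION3;
        struct berval cred = { 6, "secret" };  /* placeholder password */

        if (ldap_initialize(&ld, "ldap://ldap.example.com") != LDAP_SUCCESS)
            return 1;
        ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION, &version);

        int rc = ldap_sasl_bind_s(ld, "uid=user,dc=example,dc=com",
                                  LDAP_SASL_SIMPLE, &cred,
                                  NULL, NULL, NULL);
        printf("bind: %s\n", ldap_err2string(rc));

        ldap_unbind_ext_s(ld, NULL, NULL);
        return 0;
    }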
--
Pure Personal Opinion | HoffmanLabs LLC