Discussion: Doing time on VAX/VMS
D***@comcast.net
2017-01-19 18:40:39 UTC
$! count_timestamps.com
$!
$! For example:
$! $ @count_timestamps
$!
$! 259 ticks below threshold
$! 79 12:36:50.63
$!1073 ticks below threshold
$! 79 12:37:10.64
$!1053 ticks below threshold
$! 79 12:37:30.64
$!
$! The following routine hard-loops, so use with caution, especially on
$! single-processor systems.
$! The procedure counts how many times in a row F$FAO returns the same
$! timestamp. When that count exceeds the threshold, it first reports how
$! many consecutive ticks stayed at or below the threshold, then the count
$! and the timestamp that exceeded it.
$
$ if p1 .eqs. "" then $ p1 = 55 ! this might not be a good default for yours
$
$ old_name = f$process()
$ set process/priority=0/name=looping
$
$ threshold = f$integer( p1 )
$ c = 0
$ current_ts = ""
$ below_threshold = 0
$ on control_y then $ goto done
$ loop:
$ ts = f$fao( "!%T", 0 ) ! current time as HH:MM:SS.CC (centisecond resolution)
$ if ts .eqs. current_ts
$ then
$ c = c + 1
$ else
$ if c .gt. threshold
$ then
$ write sys$output f$fao( "!4UL ticks below threshold", below_threshold )
$ below_threshold = 0
$ write sys$output f$fao( "!4UL !AS", c, current_ts )
$ else
$ below_threshold = below_threshold + 1
$ endif
$ current_ts = ts
$ c = 1
$ endif
$ goto loop
$ done:
$ set process/priority=4/name='old_name' ! restore name; assumes default priority of 4
Bob Gezelter
2017-01-19 21:18:45 UTC
I wrote a MACRO routine to break VMS quadword time values into seconds and
nano-seconds. To my surprise, the nano-seconds value only contained precision
to centi-seconds. All decimal places beyond hundredths-of-a-second were zeros.
Given that the VMS quadword time values are documented to contain time in
100-nano-second units, I expected more digits of real precision.
My source for quadword time values is SYS$GETTIM. I'm running on a VAXstation
3500. Does SYS$GETTIM return quadword time values of different quality (read:
more lower bits are set) on different VMS machines? Would I get more actual
precision if I ran my routine on a larger VMS machine?
As a side note, I examined the milli-seconds value returned by the VAX C RTL
routine "ftime". The milli-second digit was always zero, yielding an effective
precision of only centi-seconds.
Jim,

For the record, this is true for VAX processors. Alpha and Itanium processors use a higher frequency clock (you may speculate how I know this).
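
A minimal C sketch of the underlying arithmetic: the quadword is a count of
100-nanosecond units, so splitting it (as the MACRO routine did) and then
spinning until the value changes exposes the granularity described above.
This assumes a DEC/VSI C compiler with <starlet.h> and 64-bit integer
support (Alpha/Itanium; a VAX version would need double-longword arithmetic):

/* Split a SYS$GETTIM quadword into seconds and nanoseconds, then
 * spin until the value changes to observe the clock's granularity. */
#include <stdio.h>
#include <starlet.h>                    /* sys$gettim() */

int main(void)
{
    unsigned __int64 t0, t1;

    sys$gettim((void *)&t0);
    printf("seconds: %llu  nanoseconds: %llu\n",
           t0 / 10000000u, (t0 % 10000000u) * 100u);

    /* On a VAX the delta should be 100000 units of 100 ns (10 ms);
     * processors with a higher-frequency clock show a smaller delta. */
    do {
        sys$gettim((void *)&t1);
    } while (t1 == t0);
    printf("granularity: %llu x 100 ns\n", t1 - t0);
    return 0;
}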

- Bob Gezelter, http://www.rlgsc.com
Arne Vajhøj
2017-01-20 02:08:45 UTC
Post by Bob Gezelter
I wrote a MACRO routine to break a VMS quadword time values into
seconds and nano-seconds. To my surprise, the nano-seconds value
only contained precision to centi-seconds. All decimal places
beyond hundredths-of-a-second were zeros.
Given that the VMS quadword time values are documented to contain
time in 100-nano-second units, I expected more digits of real
precision.
My source for quadword time values is SYS$GETTIM. I'm running on a
VAXstation 3500. Does SYS$GETTIM return quadword time values of
different quality (read: more lower bits are set) on different VMS
machines? Would I get more actual precision if I ran my routine on
a larger VMS machine?
As a side note, I examined the milli-seconds value returned by the
VAX$C RTL routine "ftime". The milli-second digit was always zero,
yielding an effective precision of only centi-seconds.
For the record, this is true for VAX processors. Alpha and Itanium
processors use a higher frequency clock (you may speculate how I know
this).
Note that, in general, the unit and the granularity of a get-current-time
function are two different things.

Them being different is not VMS specific.

Win32 GetTickCount and *nix gettimeofday/clock_gettime
have similar characteristics.
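
A small illustration of that distinction on the *nix side, using standard
POSIX calls: struct timespec carries nanoseconds (the unit), while
clock_getres() reports how coarsely the clock actually advances (the
granularity).

/* Unit vs. granularity: timespec is expressed in nanoseconds, but
 * clock_getres() shows the clock may advance far more coarsely. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec res, now;

    clock_getres(CLOCK_REALTIME, &res);    /* granularity */
    clock_gettime(CLOCK_REALTIME, &now);   /* value, in 1 ns units */

    printf("granularity: %ld.%09ld s\n", (long)res.tv_sec, res.tv_nsec);
    printf("now:         %ld.%09ld s\n", (long)now.tv_sec, now.tv_nsec);
    return 0;
}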

Arne
Craig A. Berry
2017-01-20 03:16:25 UTC
Post by Arne Vajhøj
Post by Bob Gezelter
I wrote a MACRO routine to break a VMS quadword time values into
seconds and nano-seconds. To my surprise, the nano-seconds value
only contained precision to centi-seconds. All decimal places
beyond hundredths-of-a-second were zeros.
Given that the VMS quadword time values are documented to contain
time in 100-nano-second units, I expected more digits of real
precision.
My source for quadword time values is SYS$GETTIM. I'm running on a
VAXstation 3500. Does SYS$GETTIM return quadword time values of
different quality (read: more lower bits are set) on different VMS
machines? Would I get more actual precision if I ran my routine on
a larger VMS machine?
As a side note, I examined the milli-seconds value returned by the
VAX$C RTL routine "ftime". The milli-second digit was always zero,
yielding an effective precision of only centi-seconds.
For the record, this is true for VAX processors. Alpha and Itanium
processors use a higher frequency clock (you may speculate how I know
this).
Note that in general unit and granularity for get current time
functionality are two different items.
Them being different is not VMS specific.
Win32 GetTickCount and *nix gettimeofday/clock_gettime
has similar characteristics.
I'm not sure why Bob responded to a post from 24 years ago, but these
days we have SYS$GETTIM_PREC as well as SYS$GETTIM if you want more
precision.
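
A hedged sketch of the comparison, assuming a recent OpenVMS release where
SYS$GETTIM_PREC is callable from C (if your starlet.h lacks the prototype,
it would need declaring by hand):

/* Compare the tick-granular software clock with the precise variant. */
#include <stdio.h>
#include <starlet.h>

int main(void)
{
    unsigned __int64 t, tp;

    sys$gettim((void *)&t);        /* advances at the clock-tick rate */
    sys$gettim_prec((void *)&tp);  /* interpolated below the tick     */

    printf("SYS$GETTIM:      %llu\n", t);
    printf("SYS$GETTIM_PREC: %llu\n", tp);
    return 0;
}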
Bob Gezelter
2017-01-20 03:28:20 UTC
Post by Craig A. Berry
Post by Arne Vajhøj
Post by Bob Gezelter
I wrote a MACRO routine to break a VMS quadword time values into
seconds and nano-seconds. To my surprise, the nano-seconds value
only contained precision to centi-seconds. All decimal places
beyond hundredths-of-a-second were zeros.
Given that the VMS quadword time values are documented to contain
time in 100-nano-second units, I expected more digits of real
precision.
My source for quadword time values is SYS$GETTIM. I'm running on a
VAXstation 3500. Does SYS$GETTIM return quadword time values of
different quality (read: more lower bits are set) on different VMS
machines? Would I get more actual precision if I ran my routine on
a larger VMS machine?
As a side note, I examined the milli-seconds value returned by the
VAX$C RTL routine "ftime". The milli-second digit was always zero,
yielding an effective precision of only centi-seconds.
For the record, this is true for VAX processors. Alpha and Itanium
processors use a higher frequency clock (you may speculate how I know
this).
Note that in general unit and granularity for get current time
functionality are two different items.
Them being different is not VMS specific.
Win32 GetTickCount and *nix gettimeofday/clock_gettime
has similar characteristics.
I'm not sure why Bob responded to a post from 24 years ago, but these
days we have SYS$GETTIM_PREC as well as SYS$GETTIM if you want more
precision.
Craig,

Bob responded because the thread became active again (and noticed that the answer was, due to the architectural changes, obsolete).

- Bob Gezelter, http://www.rlgsc.com
Stephen Hoffman
2017-01-20 22:26:05 UTC
...but these days we have SYS$GETTIM_PREC as well as SYS$GETTIM if you
want more precision.
Or use the clock_gettime call, from C.

Most or all of this newer timekeeping support is available on Alpha and
Itanium, and not VAX.

I've always wondered if the person that named that SYS$GETTIM_PREC call
knew there was a difference between precision and accuracy, and was
making a comment about the OpenVMS implementation of timekeeping. But
that's fodder for another day.
--
Pure Personal Opinion | HoffmanLabs LLC
IanD
2017-01-21 02:15:18 UTC
Post by Stephen Hoffman
...but these days we have SYS$GETTIM_PREC as well as SYS$GETTIM if you
want more precision.
Or use the clock_gettime call, from C.
Most or all of this newer timekeeping support is available on Alpha and
Itanium, and not VAX.
I've always wondered if the person that named that SYS$GETTIM_PREC call
knew there was a difference between precision and accuracy, and was
making a comment about the OpenVMS implementation of timekeeping. But
that's fodder for another day.
--
Pure Personal Opinion | HoffmanLabs LLC
Until you mentioned it, I had never really given it much thought, other than to loosely associate the two without bothering to dig deeper.

I found this quite good as a simple explanation

https://www.ncsu.edu/labwrite/Experimental%20Design/accuracyprecision.htm

I've always had this general notion (a head-based heuristic) that IT problems are pretty much caused by or somehow related to time: race conditions, synchronization, timing, recording, bla bla bla. Errors of logic excluded, of course...

How to enhance VMS clusters so that they can use a relative time source? I guess the original design was based on a universal time measurement and therefore required a single common time source for coordination purposes?

How do distributed systems ultimately resolve synchronization unless they use a single synchronization source?

How to take VMS and its clustering to a hierarchical system or better still, fully relational (I do like Google Groups and its circular concept)
Stephen Hoffman
2017-01-21 16:52:51 UTC
Post by IanD
I've always had this general notion (a head based heuristic) that IT
problems pretty much are caused by or somehow related to time. Race
conditions, synchronization, timing, recording bla bla bla. Errors of
logic excluded of course...
Many of the traditional programming languages lack support for
threading and related, but that gets lost in most discussions. Newer
incarnations of C get better here, though OpenVMS itself requires KP
Threads to get anywhere, and that's not been integrated into any of the
languages. Constructs such as libdispatch/GCD are absent. I'd
originally found ASTs very useful, but libdispatch/GCD with its blocks
and dispatch queues is just as nice to program and much more flexible
than ASTs, and the blocks keep the related code together rather than
necessarily and inherently scattering it around the module as happens
with ASTs. POSIX threads in C and KP threads in OpenVMS can work here,
too — but they're rather more complex to use, and tend to scatter the
source code logic around.

https://en.wikipedia.org/wiki/Grand_Central_Dispatch
https://github.com/apple/swift-corelibs-libdispatch
Post by IanD
How to enhance VMS clusters so that they can use a relative time
source? I guess the original design was based on a universal time
measurement and therefore required a single based time source for
coordination purposes?
How do distributed systems ultimately resolve synchronization unless
they use a single synchronization source?
There's a joke: a person with one watch knows what time it is. A
person with two watches is never sure. Between the different
computers and clocks, and the distances between the computers and the
clocks, and with the inevitable occasional packet losses and restarts
and assorted skewage, things start to get murky.

For many apps, accurate time is rather less interesting than the local
arrival order and transactional controls. Or than the most recent
update, for status data that can be UDP-multicast or such. What the
particular application expects and needs. Hopefully few of us are
still saddled with local time values assumed to be inviolate
monotonic-ascending indexes, and we're avoiding most of the problems
that can arise with errant use of CLOCK_REALTIME and CLOCK_MONOTONIC or
equivalent; but that's fodder for another discussion or three. There
are many presentations and papers on the general topic of distributed
time and timekeeping:

http://www.cis.upenn.edu/~lee/07cis505/Lec/lec-ch6-synch1-PhysicalClock-v2.pdf
http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/

http://the-paper-trail.org/blog/consensus-protocols-paxos/
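
On the CLOCK_REALTIME/CLOCK_MONOTONIC remark above, a minimal example of
the safe pattern, standard POSIX and nothing OpenVMS-specific:
CLOCK_REALTIME can be stepped by NTP or an operator, so deltas taken from
it can go backwards, while interval measurement belongs on CLOCK_MONOTONIC.

/* Measure elapsed time with CLOCK_MONOTONIC, which cannot be stepped. */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    sleep(1);                              /* stand-in for real work */
    clock_gettime(CLOCK_MONOTONIC, &end);

    printf("elapsed: %.6f s\n",
           (end.tv_sec - start.tv_sec) +
           (end.tv_nsec - start.tv_nsec) / 1e9);
    return 0;
}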

This also all ties back to the previous discussions of the CAP theorem
and related here in the comp.os.vms newsgroup, too:
https://en.wikipedia.org/wiki/CAP_theorem
Post by IanD
How to take VMS and it's clustering to a hierarchical system or better
still, fully relational (I do like Google Groups and it's circular
concept)
Clustering doesn't do much for you that can't also be done with file
shares and Zookeeper or etcd or such.

https://zookeeper.apache.org
https://github.com/coreos/etcd

For folks using DLM on OpenVMS for selecting a primary or leader or
coordinator, the following discussion (from several years ago) should
look very familiar:

http://techblog.outbrain.com/2011/07/leader-election-with-zookeeper/

That, and the DLM sequence involving $enq[w] and $deq[w] and ASTs for
this same leadership selection task certainly works, but involves
absurd amounts of glue code. I've ended up writing the necessary code
for abstracting these APIs, as most other folks have done. Flipping
huge piles of glue code to elect a leader, or otherwise. Ponder
whether new-to-OpenVMS developers want to have to learn and write and
support those same abstractions, for what should be a common and
available task within a cluster? They'll certainly ponder loading
Zookeeper or etcd and Kubernetes and running on CentOS or RHEL or Void
or otherwise, though.
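
For the flavor of that glue code, a stripped-down sketch of the usual DLM
leader-election idiom: whoever is granted an EX-mode lock on a well-known
resource name is the leader. The resource name here is hypothetical, and
the error handling, resource-scope flags, and blocking-AST handoff that a
real implementation needs are precisely the glue being complained about.

/* Sketch only: DLM-based leader election on OpenVMS.  The process
 * that is granted the EX lock on the agreed resource is the leader. */
#include <stdio.h>
#include <string.h>
#include <descrip.h>   /* $DESCRIPTOR */
#include <lckdef.h>    /* LCK$K_EXMODE */
#include <ssdef.h>     /* SS$_NORMAL */
#include <starlet.h>   /* sys$enqw() */

struct lksb {                    /* lock status block */
    unsigned short status;
    unsigned short reserved;
    unsigned int   lock_id;
};

int main(void)
{
    struct lksb lksb;
    /* MYAPP_LEADER is a hypothetical facility-prefixed resource name. */
    $DESCRIPTOR(resnam, "MYAPP_LEADER");
    int status;

    memset(&lksb, 0, sizeof lksb);

    /* Queue for an exclusive lock and wait; the call completes when we
     * own the resource, i.e. when any previous leader has gone away. */
    status = sys$enqw(0, LCK$K_EXMODE, (void *)&lksb, 0, &resnam,
                      0, 0, 0, 0, 0, 0, 0);
    if (status == SS$_NORMAL && lksb.status == SS$_NORMAL)
        printf("we are the leader (lock id %u)\n", lksb.lock_id);
    return status;
}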

OpenVMS and clustering is stuck at the DLM and shared file access era,
and the developer is then necessarily off to other tools and APIs for
service discovery, configuration, distributed authentication, app
distribution and coordination and app containment, and other such.
More glue code there, and absolutely no clear examples of how to do any
of those tasks in the "proper OpenVMS way", either. Zippo for service
discovery, outside of some old RPC bits nobody uses or DNS SRV records
or such, or maybe rolling your own integration with LDAP. There's
also little documentation around keeping OpenVMS apps from stepping on
each other, too — experienced OpenVMS developers know how, but we're
all one facility prefix collision or one leaking DEC C configuration
logical name away from a Really Weird Bug. And I've been hitting
those cases more often, as we're spending more time and effort
integrating apps from disparate sources — from app stacking and from
longer dependency chains, or whatever y'all want to call it. That's
before discussing potentially nefarious apps and tools, and y'all are
seriously deluded if you think that's not eventually going to happen
(to you), if it hasn't already (and you just don't know it).

Then there's the fact that cluster shared storage access and distributed
file locking are great right up until that same disk storage — HDDs or SSDs —
becomes a bottleneck, and that HBVS and related approaches just don't
scale, even now. How many of us have tussled with hot files and hot
disks and excessive I/O rates?

Then consider whether we're really expecting to be sharing traditional
disk storage going forward when we probably want to be running directly
out of memory and journaling to slower storage or shadowed servers for
the purposes of redundancy? I'm working more and more with data in
memory and only journaling writes to local non-volatile storage or to
memory or storage on another server, and spending somewhat less time
working around sharing disks across hosts. That old shared-HDD
approach from clustering still works certainly — and SSD helps
alleviate some of the performance limitations — but I just don't see
the popularity of that approach doing anything but declining over the
next decade, as compared with apps using RDMA or other access to
transfer and to journal data (or ZeroMQ or RabbitMQ or otherwise, for
transactional processing), and with the application data residing in
volatile and increasingly non-volatile byte-addressable memory.

TL;DR: Go Big or Go Home
--
Pure Personal Opinion | HoffmanLabs LLC
Kerry Main
2017-01-21 21:23:21 UTC
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 21, 2017 11:53 AM
Subject: Re: [Info-vax] Doing time on VAX/VMS
[...]
There is a fundamental issue with what you are saying above.

Yes, there are alternatives to shared disk (OpenVMS, Linux/GFS, z/OS), but while one does need to follow the recommended shared-disk programming models, in the shared-nothing model the inter-node management (node adds/deletes), HA, data consistency, data sharding (with its performance impacts), file replication, DR/inter-site load balancing, etc. are all done at the application level. That is a HUGE amount of additional coding and complexity that the app developer needs to consider and design into their app code. And each application group might decide to do it differently.

With a shared-disk model, these aspects of the solution are primarily handled at the OS layer: the cluster takes care of node adds/deletes, not the application; data consistency is handled by HBVS, not by file replication designed at the app level (gotta love the term "eventual consistency"); and any node can directly update any data on any system, without DB update routing designed, implemented, and maintained at the app level.

So yes, while it's not perfect and the OpenVMS shared-disk model certainly does warrant upgrades and improvements (expanded cluster node limits, as just one example), the reality is that the shared-disk model does work and is considered rock solid by most customers who use it.

As I have stated before, comparing the traditional shared-nothing programming model to the shared-disk model is really asking the question "do you want a dragster (shared nothing) or a Porsche (shared disk)?"

Reference:
http://www.scaledb.com/wp-content/uploads/2015/11/Shared-Nohing-vs-Shared-Disk-WP_SDvSN.pdf

or unwrapped link:
http://bit.ly/2dScx9k
Extract "Comparing shared-nothing and shared-disk in benchmarks is analogous to comparing a dragster and a Porsche. The dragster, like the hand-tuned shared-nothing database, will beat the Porsche in a straight quarter mile race. However, the Porsche, like a shared-disk database, will easily beat the dragster on regular roads. If your selected benchmark is a quarter mile straightaway that tests all out speed, like Sysbench, a shared-nothing database will win. However, shared-disk will perform better in real world environments."


Regards,

Kerry Main
Kerry dot main at starkgaming dot com
Stephen Hoffman
2017-01-23 01:33:55 UTC
Post by Kerry Main
http://www.scaledb.com/wp-content/uploads/2015/11/Shared-Nohing-vs-Shared-Disk-WP_SDvSN.pdf
http://bit.ly/2dScx9k
Extract "Comparing shared-nothing and shared-disk
in benchmarks is analogous to comparing a dragster and a Porsche. The
dragster, like the hand-tuned shared-nothing database, will beat the
Porsche in a straight quarter mile race. However, the Porsche, like a
shared-disk database, will easily beat the dragster on regular roads.
If your selected benchmark is a quarter mile straightaway that tests
all out speed, like Sysbench, a shared-nothing database will win.
However, shared-disk will perform better in real world environments."
"It depends". That having chased lock contention issues and I/O
saturation around more than a few clusters, and dealt with databases
that saturated the spindles. There are always trade-offs. I'm not
expecting HDDs or SSDs to be the path forward here either, particularly
as fast memory increases and as non-volatile storage moves closer to
the processors. It's hard to even share a disk when you're not really
working with disks, save for journaling or similar tasks — many of the
performance assumptions and configurations I've worked with over the
years — and that OpenVMS was designed for and assumes — may not hold up
very well in current and new designs and configurations.

https://www.quora.com/What-are-the-differences-between-shared-nothing-shared-memory-and-shared-storage-architectures-in-the-context-of-scalable-computing-analytics

http://www.benstopford.com/2009/11/24/understanding-the-shared-nothing-architecture/

https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html

Storage hardware and I/O rates and latencies and hardware prices have
changed massively since clustering was designed, which leads to
different approaches and designs for the applications. Something akin
to HBVS via RDMA would be very interesting, for instance.

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/d9f3f2da420ad83886257bba0078cdfb/$FILE/2015-03-04-RoCE%20FAQ.pdf


Stark will undoubtedly be learning more about this whole area,
particularly as y'all start to examine whether to geographically
distribute and to shard your workloads, and whether the classic
OpenVMS assumptions around storage and clustering make sense to
continue to use.

But again, "it depends".
--
Pure Personal Opinion | HoffmanLabs LLC
Kerry Main
2017-01-23 02:30:40 UTC
-----Original Message-----
Stephen Hoffman via Info-vax
Sent: January 22, 2017 8:34 PM
Subject: Re: [Info-vax] Doing time on VAX/VMS
Post by Kerry Main
http://www.scaledb.com/wp-content/uploads/2015/11/Shared-Nohing-vs-Shared-Disk-WP_SDvSN.pdf
http://bit.ly/2dScx9k
Extract "Comparing shared-nothing and shared-disk in benchmarks is
analogous to comparing a dragster and a Porsche. The dragster, like the
hand-tuned shared-nothing database, will beat the Porsche in a straight
quarter mile race. However, the Porsche, like a shared-disk database,
will easily beat the dragster on regular roads. If your selected
benchmark is a quarter mile straightaway that tests all out speed, like
Sysbench, a shared-nothing database will win. However, shared-disk will
perform better in real world environments."
"It depends". That having chased lock contention issues and I/O
saturation around more than a few clusters, and dealt with databases
that saturated the spindles. There are always trade-offs. I'm not
expecting HDDs or SSDs to be the path forward here either, particularly
as fast memory increases and as non-volatile storage moves closer to
the processors. It's hard to even share a disk when you're not really
working with disks, save for journaling or similar tasks — many of the
performance assumptions and configurations I've worked with over
the years — and that OpenVMS was designed for and assumes — may
not hold up very well in current and new designs and configurations.
There will always be exceptions as no one model is the "best" for all situations.

Agree that in order to scale higher with less overhead, some traditional OpenVMS clustering design enhancements will need to be made.

RoCEv2 would be one cluster interconnect addition that, imho, would help with both of these goals.

Btw, VSI added RoCEV2 for future consideration in their last roadmap, so perhaps we will see something post V9+.
https://www.quora.com/What-are-the-differences-between-shared-nothing-shared-memory-and-shared-storage-architectures-in-the-context-of-scalable-computing-analytics
Nice article. The following is a nice extract in support of the OpenVMS shared everything model:
"In next-generation data center architectures, there is a shift to massively faster and less-congested data-center fabric that would shift many compute problems toward a preference for Shared-Everything."
http://www.benstopford.com/2009/11/24/understanding-the-shared-nothing-architecture/
https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
Storage hardware and I/O rates and latencies and hardware prices have
changed massively since clustering was designed, which leads to
different approaches and designs for the applications. Something akin
to HBVS via RDMA would be very interesting, for instance.
Agree - any cluster overhead that needs to communicate with other nodes would hopefully be something that might be adapted to communicate over RoCEV2.

This might be similar to the old CI adapter, a dedicated special-purpose hardware adapter optimized for cluster communications.
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/d9f3f2da420ad83886257bba0078cdfb/$FILE/2015-03-04-RoCE%20FAQ.pdf
That doc is a mainframe-centered document that is not very current in terms of RoCEv2. RoCE v1 is the older standard; the newer RoCEv2 standard provides more compatibility with existing Ethernet networks.

This is a pretty good alternative doc: (see fig 1 and fig 2 for why RoCEV2 is so fast and low latency)
http://www.mellanox.com/related-docs/whitepapers/roce_in_the_data_center.pdf
"The latest version of RoCE adds even greater functionality. By changing the packet encapsulation to include IP and UDP headers, RDMA can now be used across both L2 and L3 networks. This enables Layer 3 routing, which brings RDMA to networks with multiple subnets. IP multicast is now also possible thanks to the updated version."
Stark will undoubtedly be learning more about this whole area,
particularly as y'all start to examine whether to geographically
distribute and to shard your workloads, and whether the classic
OpenVMS assumptions around storage and clustering make sense to
continue to use.
But again, "it depends".
Agree.


Regards,

Kerry Main
Kerry dot main at starkgaming dot com
