Discussion:
RX2800 sporadic disk I/O slowdowns
Richard Jordan
2024-10-18 18:26:53 UTC
Permalink
RX2800 i4 server, 64GB RAM, 4 processors, P410i controller with 10 each
2TB disks in RAID 6, broken down into volumes.

We periodically (sometimes steadily once a week, but sometimes more
frequently) see one overnight batch job take much longer than normal to
run. A normal runtime of about 30-35 minutes will stretch to 4.5 - 6.5
hours. Several images called by that job all run much slower than
normal. At the end, the overall CPU and I/O counts are very close
between a normal and a long job.

The data files are very large indexed files. Records are read and
updated but not added in this job; output is just tabulated reports.

We've run MONITOR (all classes and disk) and also built polling
snapshot jobs that check for locked/busy files and other active batch
jobs, and auto-checked through the system analyzer for any other
processes accessing the busy files at the same time as the problem
batch (two data files show long busy periods, but we do not see any
other process with channels to those files at the same time, except
for backup; see next).
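
For anyone wanting to set up something similar, a minimal sketch of such
a polling snapshot loop (device name, interval and output are
placeholders, not the actual production setup):

$ ! Polling snapshot loop - submit as a batch job so output lands in the log
$ LOOP:
$   SHOW TIME
$   ! Which files are open on the busy disk, and which processes hold channels
$   SHOW DEVICE /FILES /NOSYSTEM DKA100:
$   ! Error count, operations completed, mount status, etc. at this instant
$   SHOW DEVICE /FULL DKA100:
$   WAIT 00:05:00
$ GOTO LOOP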

The backups start at the same time, but do not get to the data disks
until well after the problem job normally completes; that does cause
concurrent access to the problem files, but it occurs only when the
job has already run long, so it is not the cause. Overall backup time
is about the same regardless of how long the problem batch takes.

Monitor during a long run shows average and peak I/O rates to the disks
with busy files at about 1/2 of what they do for normal runs. We can
see that in the process snapshots too; the direct I/O count on a slow
run increases much more slowly than on a normal run, but both normal
and long runs end up with close to the same CPU time and total I/Os.
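
For comparing a normal night against a slow one, MONITOR can also record
to a file and be played back later; a hedged example (file names and
interval are placeholders):

$ ! Record disk statistics overnight; stop with Ctrl-C or an /ENDING time
$ MONITOR DISK /INTERVAL=60 /NODISPLAY -
        /RECORD=SYS$MANAGER:DISK_NIGHTLY.DAT
$ ! Later: play the recording back into a summary report and compare it
$ ! against the summary from a normal night
$ MONITOR DISK /ITEM=QUEUE_LENGTH /INPUT=SYS$MANAGER:DISK_NIGHTLY.DAT -
        /NODISPLAY /SUMMARY=SYS$MANAGER:DISK_NIGHTLY.SUM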

Other jobs in MONITOR are somewhat slowed down, but nowhere near as
much (and they do much less disk access).

Before anyone asks: the indexed files could probably use a
cleanup/rebuild, but if that's the cause, would we see periodic
performance issues? I would expect them to be constant.

There is a backup server available, so I'm going to restore backups of
the two problem files to it and do rebuilds to see how long it takes;
that will determine how/when we can do it on the production server.
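
For what it's worth, a rough sketch of that timing test on the backup
server, assuming the files come out of a BACKUP saveset (all names below
are placeholders):

$ ! Restore one of the problem files from a saveset, then time a plain rebuild
$ BACKUP BCKDSK:[SAVESETS]NIGHTLY.BCK /SAVE_SET /SELECT=[PROD.DATA]BIGFILE1.IDX -
        TESTDSK:[RESTORE]*.*
$ started = F$TIME()
$ CONVERT /STATISTICS TESTDSK:[RESTORE]BIGFILE1.IDX TESTDSK:[RESTORE]BIGFILE1_NEW.IDX
$ WRITE SYS$OUTPUT "Convert started:  ", started
$ WRITE SYS$OUTPUT "Convert finished: ", F$TIME()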



So something is apparently causing the job to be I/O-constrained, but
so far we can't find it. The concurrent processes are the same, and
other jobs don't appear to be slowed down much (but they may be much
less I/O-sensitive or using data on other disks; I've thrown that
question to the devs).

Is there anything in the background below VMS that could cause this?
The controller doing drive checks or other maintenance activities?

Thanks for any ideas.
Craig A. Berry
2024-10-18 22:09:44 UTC
Permalink
Post by Richard Jordan
Monitor during a long run shows average and peak I/O rates to the disks
with busy files at about 1/2 of what they do for normal runs.
That is exactly what happens when the cache battery on a RAID controller
dies. Maybe yours is half-dead and sometimes takes a charge and
sometimes doesn't? MSA$UTIL should show the status of your P410.
Lawrence D'Oliveiro
2024-10-19 00:07:21 UTC
Permalink
Post by Craig A. Berry
That is exactly what happens when the cache battery on a RAID controller
dies.
I hate hardware RAID. Has VMS still not got any equivalent to mdraid?
Arne Vajhøj
2024-10-19 00:22:23 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Craig A. Berry
That is exactly what happens when the cache battery on a RAID controller
dies.
I hate hardware RAID. Has VMS still not got any equivalent to mdraid?
????

VMS got volume shadowing in 1986 I believe.

Arne
Lawrence D'Oliveiro
2024-10-19 00:35:43 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Has VMS still not got any equivalent to mdraid?
VMS got volume shadowing in 1986 I believe.
Relevance being?
Arne Vajhøj
2024-10-19 00:39:33 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Has VMS still not got any equivalent to mdraid?
VMS got volume shadowing in 1986 I believe.
Relevance being?
It is OS provided software RAID.
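
For completeness, a minimal sketch of what host-based shadowing looks
like on VMS (device names are hypothetical; it needs the VOLSHAD license
and the SHADOWING system parameter enabled):

$ ! Bind two member disks into one shadowed (RAID-1) virtual unit
$ MOUNT /SYSTEM DSA1: /SHADOW=($1$DGA101:, $1$DGA102:) DATA_SHAD
$ SHOW DEVICE DSA1: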

Isn't that what you are asking for?

Arne
Lawrence D'Oliveiro
2024-10-19 00:56:51 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Has VMS still not got any equivalent to mdraid?
VMS got volume shadowing in 1986 I believe.
Relevance being?
It is OS provided software RAID.
Does “volume shadowing” mean just RAID 1?

<https://manpages.debian.org/8/mdadm.8.en.html>
Arne Vajhøj
2024-10-19 01:12:30 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Has VMS still not got any equivalent to mdraid?
VMS got volume shadowing in 1986 I believe.
Relevance being?
It is OS provided software RAID.
Does “volume shadowing” mean just RAID 1?
I believe so.

0, 5, 6 and 10 require a RAID controller.

Arne
Lawrence D'Oliveiro
2024-10-20 01:04:06 UTC
Permalink
Post by Craig A. Berry
That is exactly what happens when the cache battery on a RAID controller
dies.
Here’s another question: why does a disk controller need a battery-backed-
up cache? Or indeed any cache at all?

Is this because it tells lies to the OS, saying that the data has been
safely written to disk when in fact it hasn’t?
Arne Vajhøj
2024-10-20 01:22:08 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Craig A. Berry
That is exactly what happens when the cache battery on a RAID controller
dies.
Here’s another question: why does a disk controller need a battery-backed-
up cache? Or indeed any cache at all?
Better performance.
Post by Lawrence D'Oliveiro
Is this because it tells lies to the OS, saying that the data has been
safely written to disk when in fact it hasn’t?
If the battery is OK then it is reasonably safe, as it will survive both
a system crash and a power outage.

Arne
Lawrence D'Oliveiro
2024-10-20 01:25:43 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Here’s another question: why does a disk controller need a battery-
backed-
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
up cache? Or indeed any cache at all?
Better performance.
Think about it: the OS already has a filesystem cache in main RAM. That
runs at main RAM speeds. Whereas the disk controller is connected through
an interface to the CPU suitable only for disk I/O speeds. So any disk
controller cache is on the wrong side of that interface.
Arne Vajhøj
2024-10-20 01:31:51 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Here’s another question: why does a disk controller need a battery-
backed-
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
up cache? Or indeed any cache at all?
Better performance.
Think about it: the OS already has a filesystem cache in main RAM. That
runs at main RAM speeds. Whereas the disk controller is connected through
an interface to the CPU suitable only for disk I/O speeds. So any disk
controller cache is on the wrong side of that interface.
That cache is toast if the system crashes. So applications
that need to be sure data are written bypass that.

If data loss with crash is acceptable then OS cache is fine.

Arne
Lawrence D'Oliveiro
2024-10-20 01:48:48 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Here’s another question: why does a disk controller need a battery-
backed-
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
up cache? Or indeed any cache at all?
Better performance.
Think about it: the OS already has a filesystem cache in main RAM. That
runs at main RAM speeds. Whereas the disk controller is connected
through an interface to the CPU suitable only for disk I/O speeds. So
any disk controller cache is on the wrong side of that interface.
That cache is toast if the system crashes. So applications
that need to be sure data are written bypass that.
You can say the same for applications keeping data in their own RAM
buffers. It’s a meaningless objection.
Arne Vajhøj
2024-10-20 01:52:16 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Here’s another question: why does a disk controller need a battery-
backed-
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
up cache? Or indeed any cache at all?
Better performance.
Think about it: the OS already has a filesystem cache in main RAM. That
runs at main RAM speeds. Whereas the disk controller is connected
through an interface to the CPU suitable only for disk I/O speeds. So
any disk controller cache is on the wrong side of that interface.
That cache is toast if the system crashes. So applications
that need to be sure data are written bypass that.
You can say the same for applications keeping data in their own RAM
buffers. It’s a meaningless objection.
The context is one where data loss is not acceptable. So data must be
persisted so that they can survive system crash and power failure.

File system cache and application cache are both no good in that case.

So it is raid controller with battery backup cache or no cache.

The first gives better performance than the second.

Very meaningful.

Arne
Lawrence D'Oliveiro
2024-10-20 03:51:32 UTC
Permalink
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
Arne Vajhøj
2024-10-20 13:17:21 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
True, but that is not the issue.

There may be a lot of stuff in progress that may get lost, but the
point is that stuff that is complete, and on which actions may already
have been taken, cannot be lost. When the system comes up again, the
stuff in progress is as if it never happened, while the completed
stuff must still be there.

Think transactions.

begin
# do a lot of stuff
# if the system crashes here the system comes up like nothing was done
commit
# if the system crashes here the changes must be there

Arne
Lawrence D'Oliveiro
2024-10-20 21:16:40 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
True, but that is not the issue.
You said it was.
Post by Arne Vajhøj
Think transactions.
Which is an entirely separate thing from data loss. This is about data
integrity.

And those caching disk controllers are useless for ensuring this.
Arne Vajhøj
2024-10-20 21:23:08 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
True, but that is not the issue.
You said it was.
Loss of completed/committed data is the problem.

Work in progress is difficult to avoid losing - and
it is not really desirable to have half-done work
saved.
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Think transactions.
Which is an entirely separate thing from data loss.
No.

Remember what the D in ACID stands for.

Arne
Lawrence D'Oliveiro
2024-10-20 21:41:04 UTC
Permalink
Post by Arne Vajhøj
Loss of completed/committed data is the problem.
The problem is not data loss, it is loss of data integrity. This is why we
have transactions in databases and filesystems: on a crash or loss of
power, we want transactions to be either completely lost or completely
saved, not in some in-between incomplete state.

There is no caching disk controller that knows how to ensure this.
Arne Vajhøj
2024-10-20 23:08:41 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Loss of completed/committed data is the problem.
The problem is not data loss, it is loss of data integrity. This is why we
have transactions in databases and filesystems: on a crash or loss of
power, we want transactions to be either completely lost or completely
saved, not in some in-between incomplete state.
There is no caching disk controller that knows how to ensure this.
Let me try again.

DB write to plates & system crash => OK but slow
DB write to OS cache & system crash => potential problem with transaction
DB write to RAID controller with battery backup & system crash => OK and
fast

Arne
Arne Vajhøj
2024-10-21 00:13:50 UTC
Permalink
Post by Arne Vajhøj
Let me try again.
DB write to plates & system crash => OK but slow
The DB knows how to make this fast. Remember its cache is faster than any
disk controller.
This is where the DB is writing to plates.

You can add a fourth scenario:

DB write to DB cache & system crash => guaranteed problem with transaction

Arne
Lawrence D'Oliveiro
2024-10-21 00:28:30 UTC
Permalink
Post by Arne Vajhøj
Post by Arne Vajhøj
Let me try again.
DB write to plates & system crash => OK but slow
The DB knows how to make this fast. Remember its cache is faster than
any disk controller.
This is where the DB is writing to plates.
DB write to DB cache & system crash => guaranteed problem with
transaction
Transaction resilience is a standard thing with databases (and journalling
filesystems) going back decades.

Some DBMSes don’t even want to work through filesystems, they would rather
manage the raw storage themselves. This is why POSIX async I/O exists
<https://manpages.debian.org/7/aio.7.en.html>.
Arne Vajhøj
2024-10-21 00:32:41 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Arne Vajhøj
Let me try again.
DB write to plates & system crash => OK but slow
The DB knows how to make this fast. Remember its cache is faster than
any disk controller.
This is where the DB is writing to plates.
DB write to DB cache & system crash => guaranteed problem with transaction
Transaction resilience is a standard thing with databases (and journalling
filesystems) going back decades.
Yes.

But they can't do miracles.

To be sure to come up ok after a system crash it is either write to
plates or write to a cache that will survive the system crash (raid
controller cache with battery backup).
Post by Lawrence D'Oliveiro
Some DBMSes don’t even want to work through filesystems, they would rather
manage the raw storage themselves. This is why POSIX async I/O exists
<https://manpages.debian.org/7/aio.7.en.html>.
Yes. That is to avoid any dangerous OS/filesystem cache (and possibly
for better performance).

Arne
Arne Vajhøj
2024-10-21 01:17:06 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Transaction resilience is a standard thing with databases (and
journalling filesystems) going back decades.
Yes.
But they can't do miracles.
They can ensure, to a high degree of confidence, that the on-disk
structure is consistent. That is to say, each transaction is either
recorded as completed or not recorded at all, nothing in-between.
Only if it can rely on a successful write not being lost.
Post by Arne Vajhøj
To be sure to come up ok after a system crash it is either write to
plates or write to a cache that will survive the system crash (raid
controller cache with battery backup).
it can’t do miracles either, all it does is add another point of failure.
Yes - it can.

It is not impacted by a system crash.

And with a power outage they have hours to get power back and get the
data written (I believe 72 hours battery power is common).

Arne
Lawrence D'Oliveiro
2024-10-21 01:28:43 UTC
Permalink
Post by Arne Vajhøj
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Transaction resilience is a standard thing with databases (and
journalling filesystems) going back decades.
Yes.
But they can't do miracles.
They can ensure, to a high degree of confidence, that the on-disk
structure is consistent. That is to say, each transaction is either
recorded as completed or not recorded at all, nothing in-between.
Only if it can rely on a successful write not being lost.
In other words, that the disk controller is not lying to you when it says
a write has completed?
Post by Arne Vajhøj
Post by Arne Vajhøj
To be sure to come up ok after a system crash it is either write to
plates or write to a cache that will survive the system crash (raid
controller cache with battery backup).
Unfortunately, that controller cache can’t guarantee any of these
things: it can’t do miracles either, all it does is add another point
of failure.
Yes - it can.
It is not impacted by a system crash.
Now you are really starting to sound like a believer in miracles ...
Arne Vajhøj
2024-10-21 01:32:05 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Transaction resilience is a standard thing with databases (and
journalling filesystems) going back decades.
Yes.
But they can't do miracles.
They can ensure, to a high degree of confidence, that the on-disk
structure is consistent. That is to say, each transaction is either
recorded as completed or not recorded at all, nothing in-between.
Only if it can rely on a successful write not being lost.
In other words, that the disk controller is not lying to you when it says
a write has completed?
Just that it is not lying when it says that it got it.
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Arne Vajhøj
To be sure to come up ok after a system crash it is either write to
plates or write to a cache that will survive the system crash (raid
controller cache with battery backup).
Unfortunately, that controller cache can’t guarantee any of these
things: it can’t do miracles either, all it does is add another point
of failure.
Yes - it can.
It is not impacted by a system crash.
Now you are really starting to sound like a believer in miracles ...
A system crash and restart will blank RAM and wipe out all OS
and filesystem caches - it will not impact the cache in the
RAID controller.

Arne
Lawrence D'Oliveiro
2024-10-21 03:27:56 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Transaction resilience is a standard thing with databases (and
journalling filesystems) going back decades.
Yes.
But they can't do miracles.
They can ensure, to a high degree of confidence, that the on-disk
structure is consistent. That is to say, each transaction is either
recorded as completed or not recorded at all, nothing in-between.
Only if it can rely on a successful write not being lost.
In other words, that the disk controller is not lying to you when it says
a write has completed?
Just that it is not lying when it says that it got it.
That’s not what it is saying. It is saying “write completed”.
Post by Arne Vajhøj
A system crash and restart will blank RAM and wipe out all OS
and filesystem caches - it will not impact the cache in the
RAID controller.
You hope.

You really are a believer in miracles, aren’t you?
Simon Clubley
2024-10-21 12:44:47 UTC
Permalink
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
Post by Arne Vajhøj
Post by Lawrence D'Oliveiro
Transaction resilience is a standard thing with databases (and
journalling filesystems) going back decades.
Yes.
But they can't do miracles.
They can ensure, to a high degree of confidence, that the on-disk
structure is consistent. That is to say, each transaction is either
recorded as completed or not recorded at all, nothing in-between.
Only if it can rely on a successful write not being lost.
In other words, that the disk controller is not lying to you when it says
a write has completed?
Just that it is not lying when it says that it got it.
That’s not what it is saying. It is saying “write completed”.
Post by Arne Vajhøj
A system crash and restart will blank RAM and wipe out all OS
and filesystem caches - it will not impact the cache in the
RAID controller.
You hope.
You really are a believer in miracles, aren’t you?
No, he isn't. He does understand, however, that the stored data will
be written to disk when power is restored, and then any transaction
or other recovery process can proceed from there.

In addition, for data which is that important, you can run a fully
shared cluster setup across multiple sites so that the loss of a server
at one site does not really impact ongoing operations across the
system as a whole.

At this point Lawrence, I can't tell if you are just a troll who is
just trying to provoke people or if you really believe what you are
saying because you do not have a detailed understanding of how this
stuff works.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Dan Cross
2024-10-21 21:42:26 UTC
Permalink
[snip]
At this point Lawrence, I can't tell if you are just a troll who is
just trying to provoke people or if you really believe what you are
saying because you do not have a detailed understanding of how this
stuff works.
Why not both?

- Dan C.
Craig A. Berry
2024-10-20 13:42:37 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
Yes, there is. It's called battery-backed cache on a RAID controller,
and when the power goes out, any writes in progress still get written
because the battery supplies power to the cache.
Lawrence D'Oliveiro
2024-10-20 21:17:27 UTC
Permalink
Post by Craig A. Berry
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
Yes, there is. It's called battery-backed cache on a RAID controller,
and when the power goes out, any writes in progress still get written
because the battery supplies power to the cache.
Unless the battery fails. Then you discover that your disk controller was
lying to you about saving the data you thought it was saving.
Simon Clubley
2024-10-21 12:35:13 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
Bollocks. It's called checkpointing and restart points for compute
based operations. It's called transaction recovery for I/O based
operations.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2024-10-21 19:22:17 UTC
Permalink
Post by Simon Clubley
Post by Lawrence D'Oliveiro
Post by Arne Vajhøj
The context is one where data loss is not acceptable.
Data loss is unavoidable. If the power goes out, all computations in
progress get lost. There’s no way around that.
Bollocks. It's called checkpointing and restart points for compute
based operations. It's called transaction recovery for I/O based
operations.
I read his comments as being about what happens between
checkpoints/commits.

It is not possible to recover that. But it is not
desirable to recover that, because that would be
an inconsistent state.

begin
update accounts set amount = amount - ? where id = ?
>>>> here <<<<
update accounts set amount = amount + ? where id = ?
commit

or:

checkpoint m
m.accounts[id_a] -= xframt
>>>> here <<<<
m.accounts[id_b] += xframt
checkpoint m

is really the same.

Arne
Richard Jordan
2024-10-21 15:45:26 UTC
Permalink
Post by Craig A. Berry
Post by Richard Jordan
Monitor during a long run shows average and peak I/O rates to the
disks with busy files at about 1/2 of what they do for normal runs.
That is exactly what happens when the cache battery on a RAID controller
dies.  Maybe yours is half-dead and sometimes takes a charge and
sometimes doesn't?  MSA$UTIL should show the status of your P410.
Initial check shows all good on the controller. No disk, battery,
cache, etc issues. I may add snapshotting the controller status via
MSA$UTIL to one of the other polling jobs in case battery or cache
status is varying.

The long run occurred again, another Monday in a row. I haven't had
time yet to review our diagnostic polling to see if any different jobs
or user activity was running concurrently, but will shortly.

I'll take a look at T4; it's been more than a decade since we tried
using it on a customer system.

Thanks
Craig A. Berry
2024-10-21 22:43:00 UTC
Permalink
Post by Craig A. Berry
Post by Richard Jordan
Monitor during a long run shows average and peak I/O rates to the
disks with busy files at about 1/2 of what they do for normal runs.
That is exactly what happens when the cache battery on a RAID controller
dies.  Maybe yours is half-dead and sometimes takes a charge and
sometimes doesn't?  MSA$UTIL should show the status of your P410.
Initial check shows all good on the controller.  No disk, battery,
cache, etc issues.  I may add snapshotting the controller status via
MSA$UTIL to one of the other polling jobs in case battery or cache
status is varying.
Yeah, it would be good to gather whatever info you can *while* the poor
performance is happening. It could be a lot of things, but with a
device that's designed to recover from or compensate for errors,
looking at it after it's recovered from whatever is bothering it may
not help much.
Volker Halle
2024-10-19 07:02:57 UTC
Permalink
Rich,

this would be a perfect opportunity to run T4 - and look at the disk
response times.

Volker.
Richard Jordan
2024-11-04 23:03:36 UTC
Permalink
Followup on this. I'm looking at one of Hein's presentations on RMS
indexed files, tuning, etc.

Presuming the system has plenty of memory and, per AUTOGEN, its state
of tune is pretty close to what AUTOGEN wants, is there any downside to
setting a global buffer count on the large indexed data files involved
in this issue (the ones that show extended 'busy' channels in the
system analyzer)? Can it cause any problems that would impact
production?
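
For concreteness, setting and checking a global buffer count is along
these lines (the count and file name are placeholders; as far as I know
the new count takes effect when the file is next opened, and the buffers
are charged against GBLPAGES/GBLSECTIONS):

$ ! Give the big indexed file a global buffer cache shared by all accessors
$ SET FILE /GLOBAL_BUFFERS=1000 DATA_DISK:[PROD]BIGFILE.IDX
$ ! Confirm - DIRECTORY/FULL reports the global buffer count
$ DIRECTORY /FULL DATA_DISK:[PROD]BIGFILE.IDX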

We already tested setting a modest process RMS buffer count for
indexed files on the accounts used for batch operations, and that seems
to make a modest improvement in runtime and a significant reduction in
direct I/Os: it saved 3-4 minutes on a 32-34 minute runtime, while DIOs
dropped from ~5.1 million to ~4.3 million.
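
For reference, that per-process setting is along the lines of the
following, e.g. in the batch account's LOGIN.COM or at the top of the
job (the count shown is illustrative only):

$ ! Raise the default RMS local buffer count for indexed files in this process
$ SET RMS_DEFAULT /INDEXED /BUFFER_COUNT=32
$ SHOW RMS_DEFAULT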

Unfortunately we still had two jobs run long, one over 7 hours (so
they killed it), the other about 4.5 hours but with the same reduced
4.3M DIO count. So it helped in general but did not make a difference
to the problem. I don't expect global buffers to fix the problem
either, but it's worth testing for performance reasons.

Thanks
abrsvc
2024-11-05 12:22:31 UTC
Permalink
Note: Global buffers can be an advantage, but are not used when dealing
with duplicate secondary keys; those are handled in local buffers. I
have seen drastic differences in performance when changing bucket
sizes, more so with secondary keys that have many duplicates than with
primary keyed access. Hein has some tools that analyze the statistics
of indexed files and report the number of I/Os per operation. High
values here can indicate inefficient use of buckets, or buckets that
are too small, forcing more I/Os to retrieve them. Increasing the
bucket size can significantly reduce I/Os, resulting in better overall
stats.

This won't directly address the reported slowdown, but might be a
trigger for it depending upon data locality.
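
For a rough first look without any special tools, the standard ANALYZE
utility reports index depth and bucket usage; a sketch with a
placeholder file name:

$ ! Summarize index levels, bucket fill and record counts for the indexed file
$ ANALYZE /RMS_FILE /STATISTICS DATA_DISK:[PROD]BIGFILE.IDX
$ ! Or capture the current structure plus statistics as an FDL for later tuning
$ ANALYZE /RMS_FILE /FDL /OUTPUT=BIGFILE.FDL DATA_DISK:[PROD]BIGFILE.IDX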

Dan
Richard Jordan
2024-11-05 19:59:30 UTC
Permalink
Note:  Global buffers can be an advantage, but are not used when dealing
with duplicate secondary keys.  Those are handled in local buffers. I
have seen drastic differences in performance when changing bucket sizes
more with secondary keys that have many duplicates than with primary
keyed access.  Hein has some tools that analyze the statistics of
indexed files that report the number of I/Os per operation. High values
here can indicate inefficient use of buckets or buckets that are too
small forcing the use of more I/Os to retrieve buckets.  Increasing the
bucket size can significantly reduce I/Os resulting in better overall
stats.
This won't directly address the reported slowdown, but might be a
trigger for it depending upon data locality.
Dan
Dan,
Apparently the name of Hein's tools changed and I just found the
one referred to in the presentation. Will try it on backup copies of
the file (on the backup server) and see what it says.

We tested doing a plain convert on all of the files involved in
this situation on the backup server, and that task may be doable one
file per weekend, but if the tuning apps require changes that mean
doing an unload/reload of the file, we're going to have to find out how
long that takes; backup windows are tight, and except for rare VMS
upgrade days (or when we moved from the RX3600 to these new servers),
downtime is very hard to get.
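
For reference, a typical bucket-size tune and rebuild via CONVERT runs
roughly like this, with placeholder names (worth double-checking the
EDIT/FDL form against HELP before running it for real):

$ ! Describe the current file and its statistics in an FDL
$ ANALYZE /RMS_FILE /FDL /OUTPUT=BIGFILE.FDL DATA_DISK:[PROD]BIGFILE.IDX
$ ! Let the FDL editor's optimize script recompute bucket sizes and areas
$ EDIT /FDL /ANALYSIS=BIGFILE.FDL /NOINTERACTIVE /SCRIPT=OPTIMIZE BIGFILE.FDL
$ ! Rebuild into a new file with the tuned attributes
$ CONVERT /FDL=BIGFILE.FDL /STATISTICS DATA_DISK:[PROD]BIGFILE.IDX -
        DATA_DISK:[PROD]BIGFILE_NEW.IDX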
Volker Halle
2024-11-05 16:32:02 UTC
Permalink
Post by Richard Jordan
We periodically (sometimes steadily once a week, but sometimes more
frequently) see one overnight batch job take much longer than normal to
run. A normal runtime of about 30-35 minutes will stretch to 4.5 - 6.5
hours. Several images called by that job all run much slower than
normal. At the end, the overall CPU and I/O counts are very close
between a normal and a long job.
If 'overall CPU and I/O counts' are about the same, please reconsider
my advice to run T4. Look at the disk response times and I/O queue
lengths and compare a 'good' and a 'slow' run.

If 'the problem' is somewhere in the disk I/O sub-system, changing
RMS buffers will only 'muddy the waters'.

Volker.
Richard Jordan
2024-11-05 19:51:20 UTC
Permalink
Post by Volker Halle
Post by Richard Jordan
We periodically (sometimes steadily once a week, but sometimes more
frequently) see one overnight batch job take much longer than normal to
run. A normal runtime of about 30-35 minutes will stretch to 4.5 - 6.5
hours. Several images called by that job all run much slower than
normal. At the end, the overall CPU and I/O counts are very close
between a normal and a long job.
If 'overall CPU and I/O counts' are about the same, please reconsider
my advice to run T4. Look at the disk response times and I/O queue
lengths and compare a 'good' and a 'slow' run.
If 'the problem' is somewhere in the disk I/O sub-system, changing
RMS buffers will only 'muddy the waters'.
Volker.
Volker,
we are getting T4 running on the backup server to re-learn it; it's
been more than 10 years since we played with it on another box.

I have MONITOR running and have been checking the I/O rates and
queue lengths during the 30+ minute runs and the multi-hour runs, and
the only diffs are that the overall I/O rates to the two disks are much
lower on the long runs than on the normal short ones.

But we'll try T4 and see what it shows once I'm happy with it on
the backup server.

This stuff is interfering with getting the 8.4-2L3 testing done so
we can upgrade the production server asap.
Volker Halle
2024-11-07 10:15:48 UTC
Permalink
...
I have MONITOR running and have been checking the I/O rates and
queue lengths during the 30+ minute runs and the multi-hour runs, and
the only diffs are that the overall I/O rates to the two disks are much
lower on the long runs than on the normal short ones.
Rich,

did you consider running some disk-I/O benchmarking tool? On the two
disks sometimes affected by the problem? And on other disks on this
RAID controller?

This could provide some baseline achievable I/O rates and response
times. You could then run those tests while 'the problem' exists and
during the 'short runs'.

If you also see the problem with a standard disk-IO benchmark,
considerations about local/global buffers may be less important.

There is the DISKBLOCK tool on the Freeware CDs, but I also have a more
current version on: https://eisner.decuserve.org/~halle/#diskblock

DISKBLOCK has a 'TEST' command to perform disk performance testing
(read-only and/or read-write).

Volker.
