Discussion:
HP HSV100 controller down due to failed batteries (Alpha ES47 - HP StorageWorks 3000)
h***@gmail.com
2020-09-29 15:00:56 UTC
I know this may be only peripherally related, but we have a system which is down completely right now because our HSV100 controllers have batteries which have failed. So far we have only been able to source batteries which are not functional. We are in day two and have been looking for ways to disable the caching on the controllers so we can bring them up without batteries.

Does anyone know if this is possible and if so, how to do it?
abrsvc
2020-09-29 15:32:27 UTC
Post by h***@gmail.com
I know this may be only peripherally related, but we have a system which is down completely right now because our HSV100 controllers have batteries which have failed. So far we have only been able to source batteries which are not functional. We are in day two and have been looking for ways to disable the caching on the controllers so we can bring them up without batteries.
Does anyone know if this is possible and if so, how to do it?
Not sure where you are looking, but NewEgg has these in stock. Not cheap though...

link: https://www.newegg.com/p/1A0-001P-01WA5

As does ISM Web:

Link: https://www.ismweb.com/product/new-hp-235870-001-battery-pack-2v-15ah-for-hsv100-controller/

Not sure about stock status here, but the price is better...

Link: https://www.zoro.com/hp-battery-pack-2v-15ah-235870-001-oem/i/G0739031/
Carl Friedberg
2020-09-29 15:56:19 UTC
That is old stuff. I recall, when a client owned an EVA4000 (HSV200), that
DEC Field Service[*] would come in and replace those batteries on a PM
schedule basis. Those were the good old days. Carl
[*] Maybe it was not that long ago, but it was our usual DEC field service
person...
Post by h***@gmail.com
Post by h***@gmail.com
I know this may be only peripherally related, but we have a system which
is down completely right now because our HSV100 controllers have batteries
which have failed. So far we have only been able to source batteries which
are not functional. We are in day two and have been looking for ways to
disable the caching on the controllers so we can bring them up without
batteries.
Post by h***@gmail.com
Does anyone know if this is possible and if so, how to do it?
Not sure where you are looking, but NewEgg has these in stock. Not cheap though...
link: https://www.newegg.com/p/1A0-001P-01WA5
https://www.ismweb.com/product/new-hp-235870-001-battery-pack-2v-15ah-for-hsv100-controller/
Not sure about stock status here, but the price is better...
https://www.zoro.com/hp-battery-pack-2v-15ah-235870-001-oem/i/G0739031/
_______________________________________________
Info-vax mailing list
http://rbnsn.com/mailman/listinfo/info-vax_rbnsn.com
--
www.comets.com
Stephen Hoffman
2020-09-29 15:55:19 UTC
Post by h***@gmail.com
I know this may be only peripherally related, but we have a system
which is down completely right now because our HSV100 controllers have
batteries which have failed. So far we have only been able to source
batteries which are not functional. We are in day two and have been
looking for ways to disable the caching on the controllers so we can
bring them up without batteries.
Does anyone know if this is possible and if so, how to do it?
That's an Enterprise Virtual Array 3000 (EVA3000) Fibre Channel Storage
Controller, roughly fifteen years back, for those following along at
home.

IIRC, the Universal StorageWorks disks used in that were the Fibre
Channel variant, and not the SCSI variant.

I'll presume there's a reason y'all haven't rolled in your backups, and
preferably rolled in those backups onto something a little newer than
that EVA3000.

And no, I don't recall a means to temporarily mount the storage
read-only and non-cached, on the EVA series storage controllers. Nor
where the EVA configuration metadata is stored.
--
Pure Personal Opinion | HoffmanLabs LLC
Randy Hancock
2020-09-29 16:40:29 UTC
Post by Stephen Hoffman
Post by h***@gmail.com
I know this may be only peripherally related, but we have a system
which is down completely right now because our HSV100 controllers have
batteries which have failed. So far we have only been able to source
batteries which are not functional. We are in day two and have been
looking for ways to disable the caching on the controllers so we can
bring them up without batteries.
Does anyone know if this is possible and if so, how to do it?
That's an Enterprise Virtual Array 3000 (EVA3000) Fibre Channel Storage
Controller, roughly fifteen years back, for those following along at
home.
IIRC, the Universal StorageWorks disks used in that were the Fibre
Channel variant, and not the SCSI variant.
I'll presume there's a reason y'all haven't rolled in your backups, and
preferably rolled in those backups onto something a little newer than
that EVA3000.
And no, I don't recall a means to temporarily mount the storage
read-only and non-cached, on the EVA series storage controllers. Nor
where the EVA configuration metadata is stored.
--
Pure Personal Opinion | HoffmanLabs LLC
Yep, this is some old stuff. I am a programmer working here and really not a hardware nor much of an O/S guy. Hardware isn't really in my purview and unfortunately the reliability of these things probably is a double-edged sword as some pieces get forgotten about. New director was in the process of reevaluating everything and making decisions when this happened. We are in the process of getting batteries refurbished now and, from what I am told, resetting the chips on the batteries so that they are not "expired". I realize better maintenance should have been done (or old hardware replaced) but it is incomprehensible to me that a device like this isn't able to run in an uncached mode.

Again, outside my normal work, but I am hoping once the batteries are rebuilt that the controller will simply pick up where it left off and recognize all of the disks in the SAN it is connected to.

Thanks for all comments and help.

Regards

Randy
John H. Reinhardt
2020-09-29 17:19:05 UTC
Post by Randy Hancock
Post by Stephen Hoffman
Post by h***@gmail.com
I know this may be only peripherally related, but we have a system
which is down completely right now because our HSV100 controllers have
batteries which have failed. So far we have only been able to source
batteries which are not functional. We are in day two and have been
looking for ways to disable the caching on the controllers so we can
bring them up without batteries.
Does anyone know if this is possible and if so, how to do it?
That's an Enterprise Virtual Array 3000 (EVA3000) Fibre Channel Storage
Controller, roughly fifteen years back, for those following along at
home.
IIRC, the Universal StorageWorks disks used in that were the Fibre
Channel variant, and not the SCSI variant.
I'll presume there's a reason y'all haven't rolled in your backups, and
preferably rolled in those backups onto something a little newer than
that EVA3000.
And no, I don't recall a means to temporarily mount the storage
read-only and non-cached, on the EVA series storage controllers. Nor
where the EVA configuration metadata is stored.
--
Pure Personal Opinion | HoffmanLabs LLC
Yep, this is some old stuff. I am a programmer working here and really not a hardware nor much of an O/S guy. Hardware isn't really in my purview and unfortunately the reliability of these things probably is a double-edged sword as some pieces get forgotten about. New director was in the process of reevaluating everything and making decisions when this happened. We are in the process of getting batteries refurbished now and, from what I am told, resetting the chips on the batteries so that they are not "expired". I realize better maintenance should have been done (or old hardware replaced) but it is incomprehensible to me that a device like this isn't able to run in an uncached mode.
Again, outside my normal work, but I am hoping once the batteries are rebuilt that the controller will simply pick up where it left off and recognize all of the disks in the SAN it is connected to.
Thanks for all comments and help.
Regards
Randy
A number of years ago I tried putting together an EVA3000 system at home. I got the various parts on Ebay and since I was an HP LTE at the time, I got the software required for the management station. But the batteries were bad on the controllers and I never did find a way around that. The controllers would never come online until the batteries showed good (I assume, since I never got that far). I was in the process of rebuilding the batteries with new cells but had a chance to get a MSA P2000 G3 setup so gave up and gave all the EVA3000 stuff away when I moved to Texas.
--
John H. Reinhardt
Stephen Hoffman
2020-09-29 18:58:59 UTC
... but it is incomprehensible to me that a device like this isn't able
to run in an uncached mode.
It'd have been nice to get no-cache read-only access, yes. Not a good
failure mode. Bad controller design, or maybe bad or missing doc.
I've seen various dual- and multi-redundant storage configurations fail
similarly over the years, when the earlier failures aren't noticed.
Old hardware is fine, so long as the site has spare parts, and the
ability and budget and time and skills to self-maintain.
I'd wager that that EVA will probably be replaced by a shelf of SSDs
with a massive I/O boost, if OpenVMS continues to be used here.
The failure chain here has a number of obvious links, and all of this
among the details that new director will be considering.
--
Pure Personal Opinion | HoffmanLabs LLC
Randy Hancock
2020-09-30 13:06:02 UTC
We are much better this morning due to the very quick turnaround of battery refurbishing by Frontier Computer Corp. in Traverse City. We were back online by 8PM last night.

Thanks to everyone who offered information and condolences ;-)

Changes will be coming soon I am sure. An I/O boost could be interesting, though we rarely have any issues with slowness with 200+ concurrent users and 500+ processes.

Back to production and back to happy. That is what matters right now!
Carl Friedberg
2020-09-30 14:56:03 UTC
I agree with Hoff. While the EVA line was well engineered for the day, it
is really old now. I checked, and the first hardware EVA 6000 manual was
dated 2005; the 3/5000 line is at least 2 years older. My client dumped
their EVA600 around 2013. As I said, field service would perform
scheduled replacement of the batteries, and I think we still had a failure
once due to a dead battery. Depending on the projected life of the current
cluster (and there's not much information about that), it might be possible
to replace a whole rackful (or more) of servers and EVA5000 with an
RX2800i6 (one, plus possibly a second to back it up), and the SSD's Steve
loves; and I agree. cheaper options.

On Tue, Sep 29, 2020 at 3:05 PM Stephen Hoffman via Info-vax <
Post by Stephen Hoffman
... but it is incomprehensible to me that a device like this isn't able
to run in an uncached mode.
It'd have been nice to get no-cache read-only access, yes. Not a good
failure mode. Bad controller design, or maybe bad or missing doc.
I've seen various dual- and multi-redundant storage configurations fail
similarly over the years, when the earlier failures aren't noticed.
Old hardware is fine, so long as the site has spare parts, and the
ability and budget and time and skills to self-maintain.
I'd wager that that EVA will probably be replaced by a shelf of SSDs
with a massive I/O boost, if OpenVMS continues to be used here.
The failure chain here has a number of obvious links, and all of this
among the details that new director will be considering.
--
Pure Personal Opinion | HoffmanLabs LLC
--
www.comets.com
Simon Clubley
2020-09-30 17:26:50 UTC
Post by Carl Friedberg
I agree with Hoff. While the EVA line was well engineered for the day, it
is really old now. I checked, and the first hardware EVA 6000 manual was
dated 2005; the 3/5000 line is at least 2 years older. My client dumped
their EVA600 around 2013. As I said, field service would perform
scheduled replacement of the batteries, and I think we still had a failure
once due to a dead battery. Depending on the projected life of the current
cluster (and there's not much information about that), it might be possible
to replace a whole rackful (or more) of servers and EVA5000 with an
RX2800i6 (one, plus possibly a second to back it up), and the SSD's Steve
loves; and I agree. cheaper options.
$ set response/mode=good_natured

15 years isn't old in the VMS world. :-) :-)

I wouldn't mind betting there are people around still using the Mylex
RAID controller...

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Robert A. Brooks
2020-09-30 17:55:10 UTC
Post by Simon Clubley
Post by Carl Friedberg
I agree with Hoff. While the EVA line was well engineered for the day, it
is really old now. I checked, and the first hardware EVA 6000 manual was
dated 2005; the 3/5000 line is at least 2 years older. My client dumped
their EVA600 around 2013. As I said, field service would perform
scheduled replacement of the batteries, and I think we still had a failure
once due to a dead battery. Depending on the projected life of the current
cluster (and there's not much information about that), it might be possible
to replace a whole rackful (or more) of servers and EVA5000 with an
RX2800i6 (one, plus possibly a second to back it up), and the SSD's Steve
loves; and I agree. cheaper options.
$ set response/mode=good_natured
15 years isn't old in the VMS world. :-) :-)
I wouldn't mind betting there are people around still using the Mylex
RAID controller...
EISNER (the DECUServe system) was using that stupid adapter until a couple of years
ago, when VSI offered the use of one of its fibre channel arrays.

I really hated that adapter when I had to use it in the mid-90's.
--
-- Rob
Rod Regier
2020-10-06 16:00:24 UTC
The good news is that the RX2800 i6 motherboard integral P410i RAID controller will fail soft to no BBWC if the FBWC supercapacitor decides to kak. Replace the supercapacitor (offline) and you'll be back in the BBWC business. The P400 and SA6402 Smartarray controllers that use batteries fail soft for write caching in the same manner. I suspect the same holds true for the P800 but I don't have any experience with it.

Chances are all of the Compaq/HP/HPE Smartarray RAID controllers function in the same fashion in regards to this feature.
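
For anyone following along, the difference between "fail soft" and what the HSV100 did in this thread can be sketched as a toy model. This is illustrative Python only: the names `CacheMode`, `select_cache_mode` and `controller_comes_online` are made up for the example and do not correspond to any Smartarray or EVA firmware, tool, or API.

```python
from enum import Enum


class CacheMode(Enum):
    WRITE_BACK = "write-back"        # writes acknowledged once they are in cache
    WRITE_THROUGH = "write-through"  # writes acknowledged only after reaching disk


def select_cache_mode(backup_power_ok: bool) -> CacheMode:
    """Fail soft: keep serving I/O, but stop caching writes without backup power."""
    return CacheMode.WRITE_BACK if backup_power_ok else CacheMode.WRITE_THROUGH


def controller_comes_online(backup_power_ok: bool, fail_soft: bool) -> bool:
    """A fail-soft controller always comes online (hypothetical model);
    a fail-hard one refuses until its battery or capacitor reports good."""
    return fail_soft or backup_power_ok
```

In this model a fail-soft controller with a dead battery still comes online, just in write-through mode with slower writes, while a fail-hard controller stays down until the battery is replaced.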
Rod Regier
2020-10-06 16:05:06 UTC
BTW I'm running 3rd party SATA SSDs in an RX2800 I6 system w/o issues so far.
(Samsung EVO 860's)
