Discussion:
scp or sftp: file is "raw", needs to be parsed - possible to work around that?
Gregory Reut
2021-05-19 11:29:14 UTC
Hi all,

Trying to scp a very large file off an Alpha VMS 8.3 system. The larger the file, the longer it takes to start the transfer. This scp message in verbose mode gives a clue:
SCP Source file is "raw", and it needs to be parsed

Parsing a 20GB file takes longer than a standard scp timeout, and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?

Thanks in advance!
Greg
Simon Clubley
2021-05-19 12:04:52 UTC
Post by Gregory Reut
Hi all,
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
Thanks in advance!
Greg
Is this a sequential file or are you trying to directly transfer the
records from a relative file or indexed file ?

If it's a sequential file, is it possible to create the file as a
Stream LF file or would that cause data corruption due to any possible
embedded characters in the data ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Gregory Reut
2021-05-19 12:11:47 UTC
Hi Simon,

Thank you for replying!
It's a backup saveset, created with a BACKUP/IMAGE command from a DKA device, so I'd guess it's a sequential file. I would give the Stream LF idea a try if I knew how to do that :) At least I can experiment on much shorter files (restore them on the target VMS system).

Best regards,
Greg
Post by Simon Clubley
Post by Gregory Reut
Hi all,
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
Thanks in advance!
Greg
Is this a sequential file or are you trying to directly transfer the
records from a relative file or indexed file ?
If its a sequential file, is it possible to create the file as a
Stream LF file or would that cause data corruption due to any possible
embedded characters in the data ?
Simon.
--
Walking destinations on a map are further away than they appear.
Chris Townley
2021-05-19 12:18:37 UTC
Post by Gregory Reut
Hi Simon,
Thank you for replying!
It's an backup saveset,created with BACKUP/IMAGE command from a DKA device, so I'd guess it's a sequential file. I eould give it a try with the stream if I'd know how to do that :) At least I can experiment on much shorter files (restore them on the target VMS system).
Best regards,
Greg
[...]
Have you tried zipping (with "-V") the saveset?

I have had success using that in the past

Chris
--
Chris Townley
Gregory Reut
2021-05-19 12:46:47 UTC
Thank you Chris,
I didn't try to zip the files as I don't think they will become small enough, but I'll give it a try!
Best regards,
Greg
Post by Chris Townley
[...]
Have you tried zipping (with "-V") the saveset?
I have had success using that in the past
Chris
--
Chris Townley
Chris Townley
2021-05-19 12:53:48 UTC
Post by Gregory Reut
Thank you Chris,
I didn't try to zip the files as I don't think they will become small enough, but I'll give it a try!
Best regards,
Greg
[...]
Have you tried zipping (with "-V") the saveset?
I have had success using that in the past
Chris
The benefit of using ZIP is not so much the compression as retaining VMS
attributes in a known format.

In the past I had a VMS system where the data backups were first disc to
disc, then zipped up in a ZIP container file, copied to a Unix machine
over NFS and then to tape.
Worked well, and the compression was such that I could retain 2 weeks'
worth on-line.
--
Chris Townley
Grant Taylor
2021-05-19 15:35:49 UTC
Post by Chris Townley
The benefit of using ZIP is so much the compression, but retaining VMS
attributes in a known format,
Does this mean that the zip file is functioning more as a container than
as compression? Wherein the container is used to hold VMS attributes
/internally/, thus isolating them from any / all /external/ interference
/ analysis?
--
Grant. . . .
unix || die
Chris Townley
2021-05-19 15:47:01 UTC
Post by Grant Taylor
Post by Chris Townley
The benefit of using ZIP is so much the compression, but retaining VMS
attributes in a known format,
Does this mean that the zip file is functioning more as a container than
as compression?  Wherein the container is used to hold VMS attributes
/internally/, thus isolating them from any / all /external/ interference
/ analysis?
Not 100% sure, but it does preserve VMS attributes (as long as you zip
up using "-V").

I used it before on a lightly loaded Itanium cluster, so using even
high compression wasn't a problem. On the old development MicroVAX, I
used minimum or no compression.

Also if being copied over unchecked FTP, ZIP would complain if the file
got corrupted.
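The corruption check being described works because ZIP stores a CRC-32 per archive member. As an analogy (not VMS ZIP itself), the same per-member CRC verification can be sketched with Python's zipfile module; the member name "saveset.bck" is just an illustrative placeholder:

```python
# Sketch: ZIP records a CRC-32 for each member, so corruption in
# transit is detectable on the receiving side. Python's zipfile
# exposes the same check via testzip().
import io
import zipfile

def make_archive(payload: bytes) -> bytes:
    """Build a small in-memory ZIP archive holding one member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("saveset.bck", payload)
    return buf.getvalue()

def first_bad_member(raw: bytes):
    """Return the name of the first member whose CRC fails, or None."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.testzip()

archive = make_archive(b"BACKUP saveset contents" * 1000)
print(first_bad_member(archive))  # None -> archive is intact
```

On an intact archive testzip() returns None; a flipped byte in a member's data makes it return that member's name, which is the "ZIP would complain" behaviour.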

Even with a password, it is not secure from prying eyes!
--
Chris Townley
Simon Clubley
2021-05-19 17:18:17 UTC
Post by Gregory Reut
Hi Simon,
Thank you for replying!
It's an backup saveset,created with BACKUP/IMAGE command from a DKA device, so I'd guess it's a sequential file. I eould give it a try with the stream if I'd know how to do that :) At least I can experiment on much shorter files (restore them on the target VMS system).
Now that I know that, I suggest you follow the recommendations from other
posters to use ZIP, with the VMS attributes encoding option enabled,
to transfer the saveset. Converting the saveset to Stream LF is not
a suitable approach.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Richard Whalen
2021-05-19 12:25:17 UTC
Post by Gregory Reut
Hi all,
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
Thanks in advance!
Greg
You don't state which SFTP/SCP you are using (TCP/IP Services, MultiNet, TCPware or SSH for OpenVMS).
The message is misleading. In the software available from Process Software, this message is issued when the file attributes have not yet been fetched and need to be.
The SFTP/SCP file transfer implementation needs the file size before transferring the file. If the file is a VMS text file, it may need to be read in full to determine the actual length.
Options to reduce this:
- convert the file to stream-lf, which is a native format for SFTP/SCP and will not require counting.
- store the file on an ods-5 disk and have the file length hint set.
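The cost being described can be modelled in a few lines: for a record-structured file, the byte count that goes over the wire cannot be derived from the allocated size, so every record prefix must be walked. This toy sketch uses a flat 2-byte length prefix per record, which is a deliberate simplification and NOT the real RMS on-disk layout:

```python
# Toy model of why a record-structured file must be read end-to-end to
# learn its stream length: each record carries a 2-byte length prefix
# (a crude stand-in for RMS variable-length records), and the
# scp-visible length is the record bytes plus one LF per record.
import struct

def pack_records(records):
    """Pack records with a little-endian 2-byte length prefix each."""
    out = bytearray()
    for r in records:
        out += struct.pack("<H", len(r)) + r
    return bytes(out)

def stream_length(raw: bytes) -> int:
    """Walk every record prefix; the cost is O(file size)."""
    total, pos = 0, 0
    while pos < len(raw):
        (rlen,) = struct.unpack_from("<H", raw, pos)
        pos += 2 + rlen
        total += rlen + 1        # record bytes + LF terminator
    return total

raw = pack_records([b"line one", b"line two!", b""])
print(stream_length(raw))  # 8+1 + 9+1 + 0+1 = 20
```

A Stream-LF file, by contrast, already is the byte stream, so its length is simply the file size: that is why the first option above avoids the counting pass.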
Richard Whalen
2021-05-19 12:31:52 UTC
Post by Richard Whalen
[...]
You don't state what SFTP/SCP you are using (TCP/IP Services, MultiNet, TCPware or SSH for OpenVMS).
The message is misleading. In the software available from Process Software this message is issued when the file attributes have not been fetched and need to be fetched.
The SFTP/SCP file transfer implementation needs the file size before transferring the file. If the file is a VMS text file the file may need to be read to determine the actual length.
- convert the file to stream-lf, which is a native format for SFTP/SCP and will not require counting.
- store the file on an ods-5 disk and have the file length hint set.
Now that I see that the file is a backup save set, I agree with the idea of ZIP "-V" to preserve the file.
Gregory Reut
2021-05-19 17:57:47 UTC
Hi Richard,

Sorry for not mentioning the versions. Here's the scp version info I can get from the executable:
SYSTEM> scp -V
tcpip$ssh_scp2.exe:Scp2/SCP2.C:2002: CRTL version (SYS$SHARE:DECC$SHARE ident) is: V8.3-01

I have one spare disk which is just big enough to hold the image backup I need to transfer, so I've reinitialized it with ODS-5 and am now zipping the saveset and moving it there.
Regarding the file length hint - is it enough to use ANALYZE/RMS/UPDATE on the file, or does something else need to be done?

Thanks in advance!
Greg
Post by Richard Whalen
[...]
You don't state what SFTP/SCP you are using (TCP/IP Services, MultiNet, TCPware or SSH for OpenVMS).
The message is misleading. In the software available from Process Software this message is issued when the file attributes have not been fetched and need to be fetched.
The SFTP/SCP file transfer implementation needs the file size before transferring the file. If the file is a VMS text file the file may need to be read to determine the actual length.
- convert the file to stream-lf, which is a native format for SFTP/SCP and will not require counting.
- store the file on an ods-5 disk and have the file length hint set.
Volker Halle
2021-05-20 05:25:32 UTC
Post by Richard Whalen
The SFTP/SCP file transfer implementation needs the file size before transferring the file. If the file is a VMS text file the file may need to be read to determine the actual length.
- convert the file to stream-lf, which is a native format for SFTP/SCP and will not require counting.
- store the file on an ods-5 disk and have the file length hint set.
Richard,

Exactly which information makes SFTP/SCP decide 'it is a VMS text file'?

Volker.
Volker Halle
2021-05-20 14:38:25 UTC
The problem is apparently not seen when using SCP between OpenVMS systems.

When trying to SCP such a big file to a Linux system, the following messages are shown:
...
tcpip$ssh_scp2.exe:SshFCTransfer/SSHFC_TRANSFER.C:396: Source file is "raw", and it needs to be parsed.
tcpip$ssh_scp2.exe:Ssh2SftpServer/SSHFILEXFERS.C:3196: Received SSH_FXP_STAT
tcpip$ssh_scp2.exe:Ssh2SftpServer/SSHFILEXFERS.C:3271: Statting file `/DUA1300/GOOGLE/xxxxxx.bck'

and there it 'hangs', reading the whole file to determine the record count and data byte count...

$ ANAL/RMS/UPDATE does not help on an OpenVMS backup saveset, as it only updates the file length hints for VAR and VFC files and only on ODS-5 volumes.

If SCP really does check just the file length hints when receiving a SSH_FXP_STAT command, patching the file header could be a workaround.

$ DUMP/HEADER/BLOCK=COUNT=0 file shows (example for variable length .TXT file):
...
File length hints
Record count: 1340 <<< quadword at offset %X8C in file header
Data byte count: 48756 <<< quadword at offset %X94 in file header

The DISKBLOCK Freeware utility could be used to patch these 2 fields in the file header:
Record count = number of backup records/blocks in the saveset
Data byte count = Record count times backup block size

The SPLIT.C code example (see above) has been successfully tested on an OpenVMS backup saveset, which could be split into smaller files and re-combined afterwards. BACKUP/LIST on the re-combined new saveset worked fine.
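The patch arithmetic above can be sketched offline. This is only an illustration of writing the two file length hint quadwords into a copy of a header block: the offsets 0x8C and 0x94 come from the DUMP/HEADER example above, while the record count (1340) and a 32256-byte block size are assumed example numbers; on a live disk the DISKBLOCK utility, not Python, would do the actual patching:

```python
# Offline sketch of the workaround described above: set the two file
# length hint quadwords in a copy of a file header block.
# Offsets per the DUMP/HEADER example; values are illustrative only.
import struct

REC_COUNT_OFF = 0x8C    # "Record count" quadword
BYTE_COUNT_OFF = 0x94   # "Data byte count" quadword

def patch_length_hints(header: bytes, records: int, block_size: int) -> bytes:
    """Return a copy of the header with both hints filled in."""
    buf = bytearray(header)
    struct.pack_into("<Q", buf, REC_COUNT_OFF, records)
    # Data byte count = record count times backup block size.
    struct.pack_into("<Q", buf, BYTE_COUNT_OFF, records * block_size)
    return bytes(buf)

header = bytes(512)     # dummy 512-byte header block stand-in
patched = patch_length_hints(header, records=1340, block_size=32256)

print(struct.unpack_from("<Q", patched, REC_COUNT_OFF)[0])   # 1340
print(struct.unpack_from("<Q", patched, BYTE_COUNT_OFF)[0])  # 43223040
```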

Volker.
Hans Bachner
2021-05-23 19:28:40 UTC
Post by Volker Halle
The problems is apparently not seen when using SCP between OpenVMS systems.
[...]
Greg - is this part of an effort to migrate a VMS system to the emulator?

If so, and given Volker's statement, you could install a new VMS
instance in the emulator (having a basic system to boot from is a good
idea anyway), probably apply relevant patches, install OpenVMS and UCX
(or NET-APP-SUP-* licenses) and configure TCP/IP.

Then you can SCP the backup saveset from VMS to VMS and unpack it there
to the "real" target disk.

Hope this helps,
Hans.
Stephen Hoffman
2021-05-20 14:54:24 UTC
Post by Volker Halle
exactly which information makes SFTP/SCP decide 'it is a VMS text file' ?
OpenVMS has occasionally had bad file-related size data returned via
the C APIs. I've seen zero sizes and 32767 sizes returned for LRL too,
which made the behavior of some dependent code that much more
interesting. Kinda like the NaT or nil or not-a-valid-value return
design here, as these old APIs just live for the burn-it-all-down
in-band out-of-range values. But I digress. The C size-related calls
could also be glacial, when they worked. Some of the previous
discussions:

https://groups.google.com/g/comp.os.vms/c/3eu8HtNxgeM/m/ZrwK0GjHBwAJ

https://groups.google.com/g/comp.os.vms/c/dhUzZ7HifPo/m/QIVOhPuwAgAJ

https://groups.google.com/g/comp.os.vms/c/H8rDEgeKS2Q/m/ucVbJeTqw-MJ

That there are (were?) reportedly ~eight different stat()
implementations/behaviors around was news to me, and sounds like a bug
farm, too.

There's at least one C logical name DECC$DEFAULT_LRL involved here,
too. This to ensure run-time confusion, and skewed test results.

And now I'm pondering what the eventual advent of 64-bit file sizes
will do to existing apps and APIs...
--
Pure Personal Opinion | HoffmanLabs LLC
Craig A. Berry
2021-05-20 15:53:33 UTC
Post by Stephen Hoffman
https://groups.google.com/g/comp.os.vms/c/dhUzZ7HifPo/m/QIVOhPuwAgAJ
That there are (were?) reportedly ~eight different stat()
implementations/behaviors around was news to me, and sounds like a bug
farm, too.
And last time I checked none of them implements the timestamps as struct
timespec as required by POSIX since 2008 rather than time_t.
Stephen Hoffman
2021-05-20 16:30:13 UTC
Post by Craig A. Berry
Post by Stephen Hoffman
https://groups.google.com/g/comp.os.vms/c/dhUzZ7HifPo/m/QIVOhPuwAgAJ
That there are (were?) reportedly ~eight different stat()
implementations/behaviors around was news to me, and sounds like a bug
farm, too.
And last time I checked none of them implements the timestamps as
struct timespec as required by POSIX since 2008 rather than time_t.
AFAICT and with the exception of POSIX threads 1003.1-1996, VSI claims
no compliance with any of the POSIX or Single UNIX Specification
standards. So there's that.

https://vmssoftware.com/docs/VSI_OVMS_SPDQS_OS_V842L1I_UPD1.pdf
https://vmssoftware.com/docs/VSI_C_spd.pdf

The next related upgrade past the recently-added C99 support will be
with the Clang port for OpenVMS on x86-64.

I'm not sure what the claimed "common mode" compliance "as implemented
on UNIX systems" is about either, other than maybe either badly
obsolete (if K&R), or just misnamed.

I'm also not sure why VSI has two different webpages for SPDs. Here's
the link with what seems to be the whole collection:
https://vmssoftware.com/resources/documentation/

Ah, and I hadn't noticed that single-host shared stream I/O is on the
roadmap. The native compilers are all arriving 2021H2; somewhere
between V9.1 and V9.2.
--
Pure Personal Opinion | HoffmanLabs LLC
Craig A. Berry
2021-05-20 18:14:50 UTC
Post by Stephen Hoffman
Post by Craig A. Berry
Post by Stephen Hoffman
https://groups.google.com/g/comp.os.vms/c/dhUzZ7HifPo/m/QIVOhPuwAgAJ
That there are (were?) reportedly ~eight different stat()
implementations/behaviors around was news to me, and sounds like a
bug farm, too.
And last time I checked none of them implements the timestamps as
struct timespec as required by POSIX since 2008 rather than time_t.
AFAICT and with the exception of POSIX threads 1003.1-1996, VSI claims
no compliance with any of the POSIX or Single UNIX Specification
standards. So there's that.
Right. Although it might be surprising to some people that compiling
with /DEFINE=(_USE_STD_STAT=1) gives you a stat struct that conforms to
an older standard rather than any recent one. And it's still a
portability problem.
Post by Stephen Hoffman
https://vmssoftware.com/docs/VSI_OVMS_SPDQS_OS_V842L1I_UPD1.pdf
https://vmssoftware.com/docs/VSI_C_spd.pdf
The SPD for C doesn't say much about the CRTL other than you need to
have it to use C. The CRTL appears not to have its own SPD and if the
release notes for last year's C99 patch are available anywhere public, I
couldn't find them.
Post by Stephen Hoffman
The next related upgrade past the recently-added C99 support will be
with the Clang port for OpenVMS on x86-64.
CRTL changes and the clang++ port seem orthogonal to me -- I don't see
them getting a new CRTL written in time for the C++ release that is
supposed to happen this quarter (or even if that slips by a quarter or two).
Post by Stephen Hoffman
Ah, and I hadn't noticed that single-host shared stream I/O is on the
roadmap. The native compilers are all arriving 2021H2; somewhere between
V9.1 and V9.2.
It's there, and so is the PostgreSQL database that depends on it.
John Reagan
2021-05-20 19:06:52 UTC
Right. Although it might be surprising to some people that compiling
with /DEFINE=(_USE_STD_STAT=1) gives you a stat struct that conforms to
an older standard rather than any recent one. And it's still a
portability problem.
Huh? The latest POSIX standard (issue 7, 2018) says:

The <sys/stat.h> header shall define the stat structure, which shall include at least the following members:

dev_t st_dev Device ID of device containing file.
ino_t st_ino File serial number.
mode_t st_mode Mode of file (see below).
nlink_t st_nlink Number of hard links to the file.
uid_t st_uid User ID of file.
gid_t st_gid Group ID of file.
[XSI][Option Start]
dev_t st_rdev Device ID (if file is character or block special).
[Option End]
off_t st_size For regular files, the file size in bytes.
For symbolic links, the length in bytes of the
pathname contained in the symbolic link.
[SHM][Option Start]
For a shared memory object, the length in bytes.
[Option End]
[TYM][Option Start]
For a typed memory object, the length in bytes.
[Option End]
For other file types, the use of this field is
unspecified.
struct timespec st_atim Last data access timestamp.
struct timespec st_mtim Last data modification timestamp.
struct timespec st_ctim Last file status change timestamp.
[XSI][Option Start]
blksize_t st_blksize A file system-specific preferred I/O block size
for this object. In some file system types, this
may vary from file to file.
blkcnt_t st_blocks Number of blocks allocated for this object.
[Option End]

Notice that it says "shall include at least the following members".

From the current <stat.h> in the __USING_STD_STAT section

struct stat {
__dev_t st_dev; /* device id */
__ino_t st_ino; /* file number */
__mode_t st_mode; /* file mode */
__MODEFILL( st_mode)
__nlink_t st_nlink; /* number of hard links */
__uid_t st_uid; /* user id */
__gid_t st_gid; /* group id */
__dev_t st_rdev;
__off_t st_size; /* file size in bytes */
__time_t st_atime; /* file access time */
__TIMEFILL( st_atime)
__time_t st_mtime; /* file mod time */
__TIMEFILL( st_mtime)
__time_t st_ctime; /* file creation time */
__TIMEFILL( st_ctime)
char st_fab_rfm; /* record format */
char st_fab_rat; /* record attributes */
char st_fab_fsz; /* fixed header size */
char st_fabFill;
unsigned st_fab_mrs; /* record size */
blksize_t st_blksize; /* filesystem-specific preferred I/O block size for this file */
blkcnt_t st_blocks; /* number of blocks allocated for this file */
char st_reserved[sizeof(__int64)*__STD_STAT_QWORDS_RESERVED];
};

# pragma __member_alignment __restore

#else /* end of __USING_STD_STAT section */

We provide every field that POSIX defines plus a few more.
Craig A. Berry
2021-05-20 19:41:28 UTC
Post by John Reagan
Right. Although it might be surprising to some people that compiling
with /DEFINE=(_USE_STD_STAT=1) gives you a stat struct that conforms to
an older standard rather than any recent one. And it's still a
portability problem.
[...]
We provide every field that POSIX defines plus a few more.
Close, but not quite. Specifically, not the timestamps. VMS has, for
example:

time_t st_atime

POSIX has:

struct timespec st_atim

No "e" at the end, and it's a struct that has both second and nanosecond
components. Most implementations would, for backward compatibility,
define the following:

#define st_atime st_atim.tv_sec

But I have encountered code that used the timespec members directly and
of course didn't work out of the box on VMS.

At first glance it looks like the __TIMEFILL padding in the VMS stat
struct would allow you to overlay a timespec struct on top of a time_t
value in a binary compatible way as long as you are not using 64-bit
time_t (which I believe was never implemented); if it is ever
implemented, there will be no way to avoid binary-incompatible changes
to the stat struct if simultaneously changing the timestamps from time_t
to timespec.

Whether nanosecond precision is available from the file system is a
different question. But there would be source compatibility advantages
even if the nanosecond portion is rounded to the nearest 10-millisecond
tick or something.
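The seconds-versus-timespec split described above has a close analogy in Python, which may make it concrete (this is Python's os.stat(), not the VMS CRTL): st_mtime is a float of seconds, while st_mtime_ns is the full-resolution integer, much as POSIX keeps st_mtime as an alias for st_mtim.tv_sec next to st_mtim.tv_nsec:

```python
# Analogy for the time_t vs. struct timespec split: Python's os.stat()
# exposes both a coarse seconds view (st_mtime, a float) and a
# full-resolution nanosecond view (st_mtime_ns, an int).
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

st = os.stat(path)
seconds = st.st_mtime_ns // 10**9       # the "tv_sec" part
nanos = st.st_mtime_ns % 10**9          # the "tv_nsec" part

# The coarse view is derivable from the fine one, but not vice versa:
# rounding st_mtime back to nanoseconds loses precision.
assert abs(st.st_mtime - (seconds + nanos / 1e9)) < 1e-5
os.unlink(path)
```

Code written against the fine-grained names (st_mtim.tv_nsec, or here st_mtime_ns) simply has no equivalent on an implementation that only offers the time_t view, which is the portability problem noted above.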
John Reagan
2021-05-20 20:35:54 UTC
[...]
Close, but not quite. Specifically, not the timestamps. VMS has, for
time_t st_atime
struct timespec st_atim
No "e" at the end, and it's a struct that has both second and nanosecond
components. Most implementations would, for backward compatibility,
#define st_atime st_atim.tv_sec
But I have encountered code that used the timespec members directly and
of course didn't work out of the box on VMS.
At first glance it looks like the __TIMEFILL padding in the VMS stat
struct would allow you to overlay a timespec struct on top of a time_t
value in a binary compatible way as long as you are not using 64-bit
time_t (which I believe was never implemented); if it is ever
implemented, there will be no way to avoid binary-incompatible changes
to the stat struct if simultaneously changing the timestamps from time_t
to timespec.
Whether nanosecond precision is available from the file system is a
different question. But there would be source compatibility advantages
even if the nanosecond portion is rounded to the nearest 10-millisecond
tick or something.
Yikes. Totally missed that. I'll add that to the list.
On my nearby RHEL box, I do see:

#if defined __USE_MISC || defined __USE_XOPEN2K8
/* Nanosecond resolution timestamps are stored in a format
equivalent to 'struct timespec'. This is the type used
whenever possible but the Unix namespace rules do not allow the
identifier 'timespec' to appear in the <sys/stat.h> header.
Therefore we have to handle the use of this header in strictly
standard-compliant sources special. */
struct timespec st_atim; /* Time of last access. */
struct timespec st_mtim; /* Time of last modification. */
struct timespec st_ctim; /* Time of last status change. */
# define st_atime st_atim.tv_sec /* Backward compatibility. */
# define st_mtime st_mtim.tv_sec
# define st_ctime st_ctim.tv_sec
#else
__time_t st_atime; /* Time of last access. */
__syscall_ulong_t st_atimensec; /* Nscecs of last access. */
__time_t st_mtime; /* Time of last modification. */
__syscall_ulong_t st_mtimensec; /* Nsecs of last modification. */
__time_t st_ctime; /* Time of last status change. */
__syscall_ulong_t st_ctimensec; /* Nsecs of last status change. */
#endif
John Reagan
2021-05-20 19:25:19 UTC
Post by Craig A. Berry
CRTL changes and the clang++ port seem orthogonal to me -- I don't see
them getting a new CRTL written in time for the C++ release that is
supposed to happen this quarter (or even if that slips by a quarter or two).
I've never seen a statement about a NEW CRTL. We've been adding enhancements
and bug fixes. Adding clang into the mix involves adding some stuff to clang and adding
some "#ifndef __clang" to various headers. And since out-of-the-box clang is 64-bit
pointers AND 64-bit longs by default, RTL code for things like printf("%lx") becomes
interesting since the RTL thinks it knows the size of long. One of many things to sort.
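The printf("%lx") mismatch above can be illustrated with a rough analogy (Python masking rather than C varargs behaviour): a runtime that assumes "long" is 32 bits effectively reads a 64-bit argument through a 32-bit window, dropping the high half:

```python
# Rough analogy for the %lx size-of-long mismatch: formatting a 64-bit
# value through a 32-bit assumption keeps only the low 32 bits.
value = 0x1122334455667788

as_64bit = f"{value:016x}"                  # full 64-bit rendering
as_32bit = f"{value & 0xFFFFFFFF:08x}"      # 32-bit-long assumption

print(as_64bit)   # 1122334455667788
print(as_32bit)   # 55667788
```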
Craig A. Berry
2021-05-20 19:57:52 UTC
Post by John Reagan
Post by Craig A. Berry
CRTL changes and the clang++ port seem orthogonal to me -- I don't see
them getting a new CRTL written in time for the C++ release that is
supposed to happen this quarter (or even if that slips by a quarter or two).
I've never seen a statement about a NEW CRTL.
Well, then you're not reading all of Hoff's posts :-).
Post by John Reagan
We've been adding enhancments
and bug fixes. Add clang into the mix involves adding some stuff to clang and adding
some "#ifndef __clang" to various headers. And since out-of-the-box clang is 64-bit
pointers AND 64-bit longs by default, RTL code for things like printf("%lx") become
interesting since the RTL thinks it knows the size of long. One of many things to sort.
That's pretty much what I expected. The CRTL functions like getopt()
that don't have a 64-bit flavor will likely be fun.
John Reagan
2021-05-20 20:47:10 UTC
Permalink
Post by Craig A. Berry
Post by John Reagan
Post by Craig A. Berry
CRTL changes and the clang++ port seem orthogonal to me -- I don't see
them getting a new CRTL written in time for the C++ release that is
supposed to happen this quarter (or even if that slips by a quarter or two).
I've never seen a statement about a NEW CRTL.
Well, then you're not reading all of Hoff's posts :-).
Post by John Reagan
We've been adding enhancments
and bug fixes. Add clang into the mix involves adding some stuff to clang and adding
some "#ifndef __clang" to various headers. And since out-of-the-box clang is 64-bit
pointers AND 64-bit longs by default, RTL code for things like printf("%lx") become
interesting since the RTL thinks it knows the size of long. One of many things to sort.
That's pretty much what I expected. The CRTL functions like getopt()
that don't have a 64-bit flavor will likely be fun.
We've recently added a getopt64. Not in any ECO kit yet (there may be one in the pipe,
pardon the pun)

$ pipe anal/image/section=symbol_vector decc$shr.exe | search sys$pipe getopt
008B7468 00002868 1293. 0000000000A85C28 PROCEDURE 000000000037E650 "DECC$GETOPT"
008B7470 00002870 1294. 0000000000A85C28 PROCEDURE 000000000037E650 "decc$getopt"
008BBD40 00007140 3624. 0000000000A8B7D8 PROCEDURE 000000000037EBD0 "DECC$_GETOPT64"
008BBD48 00007148 3625. 0000000000A8B7D8 PROCEDURE 000000000037EBD0 "decc$_getopt64"
Richard Whalen
2021-05-21 12:53:20 UTC
Permalink
Post by Gregory Reut
Post by Richard Whalen
The SFTP/SCP file transfer implementation needs the file size before transferring the file. If the file is a VMS text file the file may need to be read to determine the actual length.
- convert the file to stream-lf, which is a native format for SFTP/SCP and will not require counting.
- store the file on an ods-5 disk and have the file length hint set.
Richard,
exactly which information makes SFTP/SCP decide 'it is a VMS text file' ?
Volker.
MultiNet/TCPware (and SSH for OpenVMS) use the
Richard Whalen
2021-05-21 12:57:59 UTC
Permalink
Post by Gregory Reut
Post by Richard Whalen
The SFTP/SCP file transfer implementation needs the file size before transferring the file. If the file is a VMS text file the file may need to be read to determine the actual length.
- convert the file to stream-lf, which is a native format for SFTP/SCP and will not require counting.
- store the file on an ods-5 disk and have the file length hint set.
Richard,
exactly which information makes SFTP/SCP decide 'it is a VMS text file' ?
Volker.
MultiNet/TCPware and SSH for OpenVMS use the file's Record Format and Record Attributes to determine the type of a file.
Though a backup save set should not be mistaken for a text file, copying it without putting it in a ZIP "container" will lose attributes that must be restored before the file can be used.

Though SFTP/SCP will attempt to get the file length hint, I don't recall if it uses it for file size.

I don't know what TCP/IP Services does.
Arne Vajhøj
2021-05-19 12:38:31 UTC
Permalink
Post by Gregory Reut
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
I think there should be an option:

SCP /JUST_COPY_THE_BYTES_PLEASE ... ...

but in case there is not then maybe cut it into smaller chunks
like 2 GB at source system and combine back at target
system.

Arne
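For what it's worth, the cut-and-recombine idea can be sketched in a few lines. This is a minimal illustration only, not the SPLIT.C from the HPE article mentioned elsewhere in the thread; the chunk size and the .000/.001 suffix scheme are arbitrary choices.

```python
# Minimal split/join sketch: cut a large file into fixed-size chunks on
# the source system, concatenate them back together on the target.
import os

def split_file(path, chunk_size=2 * 1024**3):
    """Cut `path` into path.000, path.001, ... of at most chunk_size bytes."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part = f"{path}.{index:03d}"
            with open(part, "wb") as dst:
                dst.write(chunk)
            parts.append(part)
            index += 1
    return parts

def join_files(parts, out_path):
    """Concatenate the parts back into one file on the target system."""
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                dst.write(src.read())
```

Note this only moves the bytes; as pointed out later in the thread, the save set's record attributes still have to be restored on the target before BACKUP can use it.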
Gregory Reut
2021-05-19 18:00:18 UTC
Permalink
Hi Arne,

Unfortunately I haven't found any way to force scp to skip the parsing yet.
Regarding slicing the file - I'd definitely do that if I knew how to do it :) Would you have a hint for me?

(By the way, thank you for the code parsing SAS files. I've used it to write a small python script to read the files, and it worked well. I'll post the script if anybody is interested).

Best regards,
Greg
Post by Arne Vajhøj
Post by Gregory Reut
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
SCP /JUST_COPY_THE_BYTES_PLEASE ... ...
but in case there is not then maybe cut it into smaller chunks
like 2 GB at source system and combine back at target
system.
Arne
Volker Halle
2021-05-19 18:09:55 UTC
Permalink
Greg,

please see https://support.hpe.com/hpesc/public/docDisplay?docId=emr_na-c01474055

The above article includes the source code for SPLIT.COM (slow - used for floppies) and SPLIT.C (faster !).

Volker.
Chris Townley
2021-05-19 19:01:47 UTC
Permalink
Post by Volker Halle
Greg,
please see https://support.hpe.com/hpesc/public/docDisplay?docId=emr_na-c01474055
The above article includes the source code for SPLIT.COM (slow - used for floppies) and SPLIT.C (faster !).
Volker.
You can always get ZIP to split into multiple files - look at -s from
zip -h2
--
Chris Townley
Arne Vajhøj
2021-05-24 01:23:00 UTC
Permalink
Post by Gregory Reut
Post by Arne Vajhøj
Post by Gregory Reut
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
SCP /JUST_COPY_THE_BYTES_PLEASE ... ...
but in case there is not then maybe cut it into smaller chunks
like 2 GB at source system and combine back at target
system.
Unfortunately I didn't find any way to force scp to skip the parsing yet.
Regarding slicing the file - I'd definitely do that if I know how to
do it :) Would you have a hint for me?

There was a reference to a SPLIT.C - I have not used it myself,
but it sounds like exactly what you need.

Arne
Richard Whalen
2021-05-24 12:15:08 UTC
Permalink
Post by Gregory Reut
Post by Gregory Reut
Post by Arne Vajhøj
Post by Gregory Reut
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
SCP /JUST_COPY_THE_BYTES_PLEASE ... ...
but in case there is not then maybe cut it into smaller chunks
like 2 GB at source system and combine back at target
system.
Unfortunately I didn't find any way to force scp to skip the parsing yet.
Regarding slicing the file - I'd definitely do that if I know how to
do it :) Would you have a hint for me?
There were a reference to a SPLIT.C - I have not used it myself,
but sound like exactly what you need.
Arne
Even when the file doesn't need to be read, the SCP/SFTP2 code base shared by MultiNet, TCPware, and SSH for OpenVMS will take longer to start transferring a large file than a small file, because the code makes a list of "work items" to copy parts of the file. The code doesn't start transferring data until it has made a work list covering the entire file. This can use a lot of memory and time. You can reduce the amount of memory required for the work list, and the time required to create it, by increasing the transfer buffer size.
Gregory Reut
2021-05-24 16:03:56 UTC
Permalink
Post by Gregory Reut
Post by Gregory Reut
Post by Arne Vajhøj
Post by Gregory Reut
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
SCP /JUST_COPY_THE_BYTES_PLEASE ... ...
but in case there is not then maybe cut it into smaller chunks
like 2 GB at source system and combine back at target
system.
Unfortunately I didn't find any way to force scp to skip the parsing yet.
Regarding slicing the file - I'd definitely do that if I know how to
do it :) Would you have a hint for me?
There were a reference to a SPLIT.C - I have not used it myself,
but sound like exactly what you need.
Arne
Even when the file doesn't need to be read the way that the SCP/SFTP2 code base that MultiNet, TCPware, SSH for OpenVMS will take longer to start transferring a large file than a small file because the code makes a list of "work items" to copy parts of the file. The code doesn't start transferring data until it has made work list to cover the entire file. This can use a lot of memory and time. You can reduce the amount of memory required for the work list, and the time required to create the work list by increasing the transfer buffer size.
Hi Richard,
Which buffer is that? Or, better, how do I increase it, and to what size?
Thanks in advance!

@the community - thank you very much for lots of ideas and clarifications, it is a big help!

Best regards,
Greg
Richard Whalen
2021-05-25 12:16:17 UTC
Permalink
Post by Gregory Reut
Post by Gregory Reut
Post by Gregory Reut
Post by Arne Vajhøj
Post by Gregory Reut
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
SCP /JUST_COPY_THE_BYTES_PLEASE ... ...
but in case there is not then maybe cut it into smaller chunks
like 2 GB at source system and combine back at target
system.
Unfortunately I didn't find any way to force scp to skip the parsing yet.
Regarding slicing the file - I'd definitely do that if I know how to
do it :) Would you have a hint for me?
There were a reference to a SPLIT.C - I have not used it myself,
but sound like exactly what you need.
Arne
Even when the file doesn't need to be read the way that the SCP/SFTP2 code base that MultiNet, TCPware, SSH for OpenVMS will take longer to start transferring a large file than a small file because the code makes a list of "work items" to copy parts of the file. The code doesn't start transferring data until it has made work list to cover the entire file. This can use a lot of memory and time. You can reduce the amount of memory required for the work list, and the time required to create the work list by increasing the transfer buffer size.
Hi Richard,
Which buffer is that? Or, better, how to increase to and to what size?
Thanks in advance!
@the community - thank you very much for lots of ideas and clarifications, it is a big help!
Best regards,
Greg
For MultiNet/TCPware/SSH for OpenVMS

On either the SCP2 or SFTP command line

/BUFFER_SIZE=integer

Number of bytes of data to transfer in a buffer. Default is 7500.
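Richard's /BUFFER_SIZE default of 7500 bytes makes the work-list arithmetic easy to check. A quick sketch for the 20 GB file from the original post; modelling one work item per buffer is an assumption based on his description, not documented behaviour.

```python
# Rough arithmetic for the work list described above: one work item per
# transfer buffer, so a larger buffer means a much shorter list.
import math

file_size = 20 * 1024**3          # the 20 GB save set from the original post

for buffer_size in (7500, 65536, 1024 * 1024):
    work_items = math.ceil(file_size / buffer_size)
    print(f"{buffer_size:>8} bytes/buffer -> {work_items:>9} work items")
```

At the 7500-byte default that is roughly 2.9 million work items for 20 GB; a 1 MB buffer cuts it to about 20 thousand.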
Jeffrey H. Coffield
2021-05-20 17:19:27 UTC
Permalink
Post by Gregory Reut
Hi all,
SCP Source file is "raw", and it needs to be parsed
Parsing a 20GB file takes longer than a standard scp timeout. and I need to scp files 10 times larger.
Perhaps there is a way to skip the parsing?
Thanks in advance!
Greg
If you are creating the save set, can you make a series of smaller save
sets? I tend to make a lot of little save sets like one per file or one
per directory so it's quicker to do a restore of a particular file
rather than having to restore a large save set just to get one file that
was deleted.

Some other posts suggested using zip but I believe zip has an upper
limit of ~2GB.

File attributes for a save set can be manually restored with SET
FILE/ATTRIBUTES=(RFM:FIX,LRL:32256), where the 32256 is obtained from a
DIR/FULL of the original save set, looking at the record format. We have
copied save sets to a Linux box and then used bzip2 to compress them
enough to fit on a Blu-Ray disk, and needed this attribute fix to
restore the save set afterwards.

Jeff Coffield
www.digitasynergyinc.com
Craig A. Berry
2021-05-20 17:25:28 UTC
Permalink
Post by Jeffrey H. Coffield
Some other posts suggested using zip but I believe zip has an upper
limit of ~2GB.
Only for ancient versions of zip.
Stephen Hoffman
2021-05-20 17:32:36 UTC
Permalink
Post by Jeffrey H. Coffield
Some other posts suggested using zip but I believe zip has an upper
limit of ~2GB.
zip 3.0 and later and unzip 6.0 and later—which have been available for
a ~dozen years—have no such limits.

Now-archaic versions of zip and unzip do have addressing limits, and
those older versions can and do have other issues too.

zip "-V" (with the quotes, if extended parsing is not enabled) is the
easiest way to shovel around app files and data these days, and
particularly if working in heterogeneous networks.

If archiving OpenVMS apps and data, zip "-V" is also locally preferable
to BACKUP. Easier to access those archives from arbitrary systems than
to unpack a saveset—and yes, I know about vmsbackup.
--
Pure Personal Opinion | HoffmanLabs LLC
Phillip Helbig (undress to reply)
2021-05-20 18:34:39 UTC
Permalink
Post by Jeffrey H. Coffield
Some other posts suggested using zip but I believe zip has an upper
limit of ~2GB.
That limit was removed years, maybe more than a decade, ago in ZIP and
UNZIP.