Discussion:
Text processing on VMS
David Meyer
2024-10-13 15:04:25 UTC
I've got a text file with data that I want to select lines matching
certain character strings, then extract string values from the selected
lines by character position. On Unix, I would use awk or Perl. Does VMS
have a similar tool, should I use my favorite programming language and
call the STR$ RTL, can I write a TPU script to do this, or should I
transfer the file to a Unix box and use awk or Perl? ;)
--
David Meyer
Takarazuka, Japan
***@sdf.org
Chris Townley
2024-10-13 15:20:20 UTC
Post by David Meyer
I've got a text file with data that I want to select lines matching
certain character strings, then extract string values from the selected
lines by character position. On Unix, I would use awk or Perl. Does VMS
have a similar tool, should I use my favorite programming language and
call the STR$ RTL, can I write a TPU script to do this, or should I
transfer the file to a Unix box and use awk or Perl? ;)
I have often done this in this manner:

First, SEARCH the file for matching records, writing them to a second file.
Then read through that file in a DCL loop, using f$locate to find the string
and f$extract to extract what I want, and write the results to a third file.
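
For comparison with the scripting-language answers in this thread, the same locate-then-extract logic can be sketched in Java. This is only a sketch under assumptions: the class and method names are made up for illustration, the sample data is invented, and the two helpers are meant to imitate the DCL lexicals (f$locate giving back the string length when nothing is found, f$extract truncating at the end of the string rather than failing).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; none of these names come from the thread.
class LocateExtract {

    // Imitates F$LOCATE: offset of the first match, or the string length if absent.
    static int locate(String needle, String line) {
        int pos = line.indexOf(needle);
        return pos >= 0 ? pos : line.length();
    }

    // Imitates F$EXTRACT: up to `length` characters from 0-based `offset`,
    // truncated (not an error) when the line is too short.
    static String extract(int offset, int length, String line) {
        if (offset < 0 || length <= 0 || offset >= line.length()) {
            return "";
        }
        int end = Math.min(line.length(), offset + length);
        return line.substring(offset, end);
    }

    // Select lines containing `key`, then take `length` characters starting at the match.
    static List<String> selectAndExtract(List<String> lines, String key, int length) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            int pos = locate(key, line);
            if (pos < line.length()) {   // same "found" test one would write in DCL
                out.add(extract(pos, length, line));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample records, standing in for the SEARCH output file.
        List<String> lines = List.of("id=1234 name=foo", "no match here", "x id=99");
        selectAndExtract(lines, "id=", 7).forEach(System.out::println);
    }
}
```

In the DCL procedure the list would be replaced by READ from the second file and WRITE to the third.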
--
Chris
David Meyer
2024-10-14 01:23:23 UTC
SEARCH/DCL/SORT for the VMS win!

Thanks for all suggestions. Highly educational.
--
David Meyer
Takarazuka, Japan
***@sdf.org
Craig A. Berry
2024-10-13 17:48:10 UTC
Post by David Meyer
I've got a text file with data that I want to select lines matching
certain character strings, then extract string values from the selected
lines by character position. On Unix, I would use awk or Perl. Does VMS
have a similar tool, should I use my favorite programming language and
call the STR$ RTL, can I write a TPU script to do this, or should I
transfer the file to a Unix box and use awk or Perl? ;)
Perl is available. It's part of the base install on OpenVMS x86. For
Alpha and Itanium you can get an installer at:

https://vmssoftware.com/products/perl/

Python is also available. While what you want can be done with DCL or
TPU, that's generally more pain for less gain.
Arne Vajhøj
2024-10-13 18:39:48 UTC
Post by David Meyer
I've got a text file with data that I want to select lines matching
certain character strings, then extract string values from the selected
lines by character position. On Unix, I would use awk or Perl. Does VMS
have a similar tool, should I use my favorite programming language and
call the STR$ RTL, can I write a TPU script to do this, or should I
transfer the file to a Unix box and use awk or Perl? ;)
Both Perl and gawk are available for VMS.

VSI distribute Perl - Alpha and Itanium here
https://vmssoftware.com/products/perl/ - x86-64 I believe comes with VMS

Gawk you can get from the net -
https://vms.process.com/scripts/fileserv/fileserv.com?GAWK

You can also use some other script language: Python, Groovy etc.

(I like Groovy)

A traditional VMS language (Cobol, Fortran, Basic, Pascal) and builtin
string functionality or STR$ calls will likely be much more code.

Arne
Dave Froble
2024-10-13 18:57:38 UTC
Post by Arne Vajhøj
Post by David Meyer
I've got a text file with data that I want to select lines matching
certain character strings, then extract string values from the selected
lines by character position. On Unix, I would use awk or Perl. Does VMS
have a similar tool, should I use my favorite programming language and
call the STR$ RTL, can I write a TPU script to do this, or should I
transfer the file to a Unix box and use awk or Perl? ;)
Both Perl and gawk are available for VMS.
VSI distribute Perl - Alpha and Itanium here
https://vmssoftware.com/products/perl/ - x86-64 I believe comes with VMS
Gawk you can get from the net -
https://vms.process.com/scripts/fileserv/fileserv.com?GAWK
You can also use some other script language: Python, Groovy etc..
(I like Groovy)
A traditional VMS language (Cobol,Fortran,Basic,Pascal) and builtin
string functionality or STR$ calls will likely be much more code.
Arne
Using SEARCH and then a simple Basic program is not that much work.

For example:

SEARCH File1.txt "some text" /output=File2.txt

1 On Error Goto 90

10 Open "File2.txt" For Input as File 1%      ! the SEARCH output
   Open "File3.txt" For Output as File 2%     ! File3.txt is an assumed name

20 Linput #1%, Z$
   Print #2%, Mid(Z$,?,?)
   Goto 20

90 Resume 99 If ERR=11                        ! ERR 11 is end of file
   On Error GoTo 0

99 End

Simple
No need to know whatever your favorite utility is
I seriously doubt there would be many fewer characters

No, I didn't try it ...
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Arne Vajhøj
2024-10-13 19:26:03 UTC
Post by Dave Froble
Post by Arne Vajhøj
Post by David Meyer
I've got a text file with data that I want to select lines matching
certain character strings, then extract string values from the selected
lines by character position. On Unix, I would use awk or Perl. Does VMS
have a similar tool, should I use my favorite programming language and
call the STR$ RTL, can I write a TPU script to do this, or should I
transfer the file to a Unix box and use awk or Perl? ;)
Both Perl and gawk are available for VMS.
VSI distribute Perl - Alpha and Itanium here
https://vmssoftware.com/products/perl/ - x86-64 I believe comes with VMS
Gawk you can get from the net -
https://vms.process.com/scripts/fileserv/fileserv.com?GAWK
You can also use some other script language: Python, Groovy etc..
(I like Groovy)
A traditional VMS language (Cobol,Fortran,Basic,Pascal) and builtin
string functionality or STR$ calls will likely be much more code.
Using SEARCH and then a simple Basic program is not that much work.
SEARCH File1.txt "some text" /output=File2.txt
1    On Error Goto 90
10    Open "File2.txt" For Input as File 1%
    Open "File3.txt" For Output as File 2%
20    Linput #1%, Z$
    Print #2%, Mid(Z$,?,?)
    Goto 20
90    GoTo 99 If ERR=11
    On Error GoTo 0
99    End
Simple
No having to know whatever is your favorite utility
I seriously doubt there would be much fewer characters
No, I didn't try it ...
I have confidence in your VMS Basic skills.

:-)

A compound solution of SEARCH and a program is an option.

But in a relevant script language then it should be a one statement
problem (although in most cases splitting that one statement over
multiple lines is a good thing for readability).

import java.nio.file.*

Files.lines(Paths.get("login.com"))
     .filter(line -> line.contains("java"))
     .map(line -> line[2..12])
     .forEach(System.out::println)

This outputs positions 2..12 (positions are 0-based!) from all lines of
login.com that contain "java".

Arne
Craig A. Berry
2024-10-13 22:24:14 UTC
Post by Arne Vajhøj
But in a relevant script language then it should be a one statement
problem (although in most cases splitting that one statement over
multiple lines is a good thing for readability).
import java.nio.file.*
Files.lines(Paths.get("login.com"))
     .filter(line -> line.contains("java"))
     .map(line -> line[2..12])
     .forEach(System.out::println)
output pos 2..12 (pos is 0 based!) from all lines of login.com
that contains "java".
Opening an editor, typing all that in, running the java compiler, and
then running the compiled program all seems like a lot of work to me
when all you need to do is:

$ perl -nE "say substr($_, 2, 11) if $_ =~ m/java/i;" < login.com
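
One difference from the quoted code is worth spelling out: m/java/i matches case-insensitively, while String.contains("java") is case-sensitive. A minimal Java sketch of a case-insensitive filter follows, assuming the key is a literal string and not a regex; the class name, method name, and sample lines are hypothetical. Note that Pattern.CASE_INSENSITIVE on its own only folds ASCII letters.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical example class; names are illustrative only.
class CaseInsensitiveSelect {

    // Keep the lines that contain `key`, ignoring ASCII letter case.
    static List<String> select(List<String> lines, String key) {
        // Pattern.quote makes the key match literally, so "." or "$" are not regex syntax.
        Pattern p = Pattern.compile(Pattern.quote(key), Pattern.CASE_INSENSITIVE);
        return lines.stream()
                    .filter(line -> p.matcher(line).find())
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample lines in the style of a login.com.
        List<String> lines = List.of("$ JAVA :== x", "$ java :== y", "$ perl :== z");
        select(lines, "java").forEach(System.out::println);
    }
}
```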
Arne Vajhøj
2024-10-14 00:14:21 UTC
Post by Craig A. Berry
Post by Arne Vajhøj
But in a relevant script language then it should be a one statement
problem (although in most cases splitting that one statement over
multiple lines is a good thing for readability).
import java.nio.file.*
Files.lines(Paths.get("login.com"))
      .filter(line -> line.contains("java"))
      .map(line -> line[2..12])
      .forEach(System.out::println)
output pos 2..12 (pos is 0 based!) from all lines of login.com
that contains "java".
Opening an editor, typing all that in, running the java compiler, and
then running the compiled program all seems like a lot of work to me
$ perl -nE "say substr($_, 2, 11) if $_ =~ m/java/i;" < login.com
It is not Java but Groovy, so compile is optional.

And groovysh does have an -e for evaluating code given in command line
(it is just rarely used).

But you are absolutely right: Perl code is shorter than Groovy code.

Arne
Arne Vajhøj
2024-10-14 00:39:08 UTC
Post by Arne Vajhøj
It is not Java but Groovy, so compile is optional.
The difference is all in the wrapping though.

$ type P.java
import java.nio.file.Files;
import java.nio.file.Paths;

public class P {
    public static void main(String[] args) throws Exception {
        Files.lines(Paths.get("login.com"))
             .filter(line -> line.contains("java"))
             .map(line -> line.substring(2, 13))
             .forEach(System.out::println);
    }
}
$ type s.groovy
import java.nio.file.*

Files.lines(Paths.get("login.com"))
     .filter(line -> line.contains("java"))
     .map(line -> line[2..12])
     .forEach(System.out::println)
$ javac P.java
$ java P
...
$ groovy s.groovy
...

(I don't even have groovysh defined on VMS, so no easy way to try -e)
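
One caveat with the Java version: substring(2, 13) takes an exclusive end index (hence 13 for the inclusive range 2..12) and throws StringIndexOutOfBoundsException when a matching line is shorter than 13 characters. A minimal sketch of a tolerant variant follows; the class name, the method names, and the sample lines are assumptions made for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical variant of P.java; names are illustrative only.
class SafeColumns {

    // Characters first..last (0-based, both ends inclusive), truncated if the line is short.
    static String columns(String line, int first, int last) {
        if (first < 0 || last < first || first >= line.length()) {
            return "";
        }
        // substring wants an exclusive end, so the inclusive `last` becomes last + 1.
        int endExclusive = Math.min(line.length(), last + 1);
        return line.substring(first, endExclusive);
    }

    // Filter on `key`, then cut the requested columns from each selected line.
    static List<String> select(List<String> lines, String key, int first, int last) {
        return lines.stream()
                    .filter(line -> line.contains(key))
                    .map(line -> columns(line, first, last))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample lines; the second one is shorter than 13 characters.
        List<String> lines = List.of("$ java :== $sys$system:java.exe", "$ java", "$ exit");
        select(lines, "java", 2, 12).forEach(System.out::println);
    }
}
```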

Arne
Arne Vajhøj
2024-10-14 00:51:12 UTC
Post by Arne Vajhøj
$ type s.groovy
import java.nio.file.*
Files.lines(Paths.get("login.com"))
      .filter(line -> line.contains("java"))
      .map(line -> line[2..12])
      .forEach(System.out::println)
$ type s2.groovy
import java.nio.file.*
Files.lines(Paths.get("login.com"))
     .filter({ it.contains("java") })
     .map({ it[2..12] })
     .forEach({ println(it) })
But I don't think that improves readability.
Or:

$ type s3.groovy
import java.nio.file.*

Files.lines(Paths.get("login.com"))
     .filter({ it.contains("java") })
     .map({ it[2..12] })
     .each(this.&println)

Arne
Arne Vajhøj
2024-10-14 00:47:11 UTC
Post by Arne Vajhøj
$ type s.groovy
import java.nio.file.*
Files.lines(Paths.get("login.com"))
     .filter(line -> line.contains("java"))
     .map(line -> line[2..12])
     .forEach(System.out::println)
Note that it is probably more Groovy-like with:

$ type s2.groovy
import java.nio.file.*

Files.lines(Paths.get("login.com"))
     .filter({ it.contains("java") })
     .map({ it[2..12] })
     .forEach({ println(it) })

But I don't think that improves readability.

Arne
Arne Vajhøj
2024-10-14 02:35:20 UTC
Post by Arne Vajhøj
Post by Dave Froble
Using SEARCH and then a simple Basic program is not that much work.
SEARCH File1.txt "some text" /output=File2.txt
1    On Error Goto 90
10    Open "File2.txt" For Input as File 1%
     Open "File3.txt" For Output as File 2%
20    Linput #1%, Z$
     Print #2%, Mid(Z$,?,?)
     Goto 20
90    GoTo 99 If ERR=11
     On Error GoTo 0
99    End
Simple
No having to know whatever is your favorite utility
I seriously doubt there would be much fewer characters
No, I didn't try it ...
I have confidence in your VMS Basic skills.
But I am curious about how you iterate over the file.

Are there any benefits to doing it this way compared to:

handler eof_handler
end handler
when error use eof_handler
    while 1 = 1
        get #1
        ! do whatever
    next
end when

?

Arne
Simon Clubley
2024-10-14 12:30:19 UTC
Post by Arne Vajhøj
Post by Arne Vajhøj
Post by Dave Froble
Using SEARCH and then a simple Basic program is not that much work.
SEARCH File1.txt "some text" /output=File2.txt
1    On Error Goto 90
10    Open "File2.txt" For Input as File 1%
     Open "File3.txt" For Output as File 2%
20    Linput #1%, Z$
     Print #2%, Mid(Z$,?,?)
     Goto 20
90    GoTo 99 If ERR=11
     On Error GoTo 0
99    End
Simple
No having to know whatever is your favorite utility
I seriously doubt there would be much fewer characters
No, I didn't try it ...
I have confidence in your VMS Basic skills.
But I am curious about how you iterate over the file.
handler eof_handler
end handler
when error use eof_handler
while 1 = 1
get #1
! do whatever
next
end when
That's how a Pascal programmer would write it. David however clearly
prefers Dartmouth Basic. :-)

BTW, I think your approach is a lot more readable than David's style. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2024-10-14 13:56:04 UTC
Post by Simon Clubley
Post by Arne Vajhøj
Post by Arne Vajhøj
Post by Dave Froble
Using SEARCH and then a simple Basic program is not that much work.
SEARCH File1.txt "some text" /output=File2.txt
1    On Error Goto 90
10    Open "File2.txt" For Input as File 1%
     Open "File3.txt" For Output as File 2%
20    Linput #1%, Z$
     Print #2%, Mid(Z$,?,?)
     Goto 20
90    GoTo 99 If ERR=11
     On Error GoTo 0
99    End
Simple
No having to know whatever is your favorite utility
I seriously doubt there would be much fewer characters
No, I didn't try it ...
I have confidence in your VMS Basic skills.
But I am curious about how you iterate over the file.
handler eof_handler
end handler
when error use eof_handler
while 1 = 1
get #1
! do whatever
next
end when
That's how a Pascal programmer would write it. David however clearly
prefers Dartmouth Basic. :-)
BTW, I think your approach is a lot more readable than David's style. :-)
But I did not come up with that construct myself. I must have gotten
it from somewhere. I am just not sure where.

BTW, I think it would be nice if the compiler wizard added
either:

while not eof #1
    get #1
    ! do whatever
next

or:

while true
    get #1, eof=100
    ! do whatever
next
100:

Arne

Dave Froble
2024-10-14 13:01:49 UTC
Post by Arne Vajhøj
Post by Arne Vajhøj
Post by Dave Froble
Using SEARCH and then a simple Basic program is not that much work.
SEARCH File1.txt "some text" /output=File2.txt
1 On Error Goto 90
10 Open "File2.txt" For Input as File 1%
Open "File3.txt" For Output as File 2%
20 Linput #1%, Z$
Print #2%, Mid(Z$,?,?)
Goto 20
90 GoTo 99 If ERR=11
On Error GoTo 0
99 End
Simple
No having to know whatever is your favorite utility
I seriously doubt there would be much fewer characters
No, I didn't try it ...
I have confidence in your VMS Basic skills.
But I am curious about how you iterate over the file.
handler eof_handler
end handler
when error use eof_handler
while 1 = 1
get #1
! do whatever
next
end when
?
Arne
Yes, I like to keep things very simple, otherwise it hurts my brain ...

If I still have one, not sure ...
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Stephen Hoffman
2024-10-13 19:38:51 UTC
Post by David Meyer
I've got a text file with data that I want to select lines matching
certain character strings, then extract string values from the selected
lines by character position. On Unix, I would use awk or Perl. Does VMS
have a similar tool, should I use my favorite programming language and
call the STR$ RTL, can I write a TPU script to do this, or should I
transfer the file to a Unix box and use awk or Perl? ;)
One-shot task?

Haul it over to an existing and working Unix, and "awk to it" there.
It'll be easier, generally. Easier, particularly if the text file
contains UTF-8, though this file is from OpenVMS so probably not.

For production use?

Other folks have listed various options to perform this task on OpenVMS.

Another option that seemingly wasn't mentioned is installing and using
GNV on OpenVMS. Obligatory vim and emacs reference.

The least-accretive and easiest-to-hand-off-to-others option on OpenVMS
is approximately a DCL LOOP: / READ / WRITE / GOTO LOOP procedure.
Which is a slog if unfamiliar with DCL, but is entirely doable.
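
The LOOP: / READ / WRITE / GOTO LOOP shape is the same in most languages. As a rough illustration only, here is a minimal Java sketch of that loop, where readLine() returning null plays the part of the end-of-file exit. The class name, method name, and sample input are hypothetical, and a Reader/Writer pair stands in for the input and output files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical example class; names are illustrative only.
class ReadWriteLoop {

    // Copy every line containing `key` from `in` to `out`; returns the number written.
    static int copyMatching(Reader in, Writer out, String key) {
        try {
            BufferedReader reader = new BufferedReader(in);
            int written = 0;
            String line;
            // READ: readLine() returns null at end of file, which ends the loop.
            while ((line = reader.readLine()) != null) {
                if (line.contains(key)) {
                    // WRITE: emit the selected record followed by a newline.
                    out.write(line);
                    out.write("\n");
                    written++;
                }
            }
            out.flush();
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Invented input standing in for the text file.
        StringWriter out = new StringWriter();
        int n = copyMatching(new StringReader("$ java a\n$ perl b\n$ java c\n"), out, "java");
        System.out.print(out);
        System.out.println(n + " line(s) written");
    }
}
```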

If this file has some sort of internal organization or syntax, use of a
lib$table_parse grammar can be an option.
--
Pure Personal Opinion | HoffmanLabs LLC