Discussion:
State of the Port to x86_64 January 2017
Neil Rieck
2017-01-10 19:35:05 UTC
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!

http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf

Neil Rieck
Waterloo, Ontario, Canada.
http://www3.sympatico.ca/n.rieck/
u***@gmail.com
2017-01-11 15:48:55 UTC
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
Neil Rieck
Waterloo, Ontario, Canada.
http://www3.sympatico.ca/n.rieck/
Nice to see Sue still with OpenVMS. I wish the developers
would get away from C code, as it always causes problems
later.
Robert A. Brooks
2017-01-11 16:59:54 UTC
Post by u***@gmail.com
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
Nice to see Sue still with OpenVMS. I wish the developers
would get away from C code, as it always causes problems
later.
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.

Anyway, my retirement plan is predicated on rewriting the shadowing
driver in C, thereby earning the Reagan Bucks for MACRO-->C conversions.
--
-- Rob
Stephen Hoffman
2017-01-11 17:53:18 UTC
Post by Robert A. Brooks
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Post the source code and maybe somebody will run some scanning tools
(e.g. Klee) and fuzzers (e.g. AFL)?

C undefined behavior — and I've been programming in C for longer than I
care to admit — can be subtle and pernicious.
Post by Robert A. Brooks
Anyway, my retirement plan is predicated on rewriting the shadowing
driver in C, thereby earning the Reagan Bucks for MACRO-->C conversions.
John hasn't offered you that Macro32-to-C translator tool he's
been working on? I swiped a copy from his development directory a
while back, and it works nicely. 😉
--
Pure Personal Opinion | HoffmanLabs LLC
Bill Gunshannon
2017-01-12 02:41:10 UTC
Post by Stephen Hoffman
Post by Robert A. Brooks
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Post the source code and maybe somebody will run some scanning tools
(e.g. Klee) and fuzzers (e.g. AFL)?
C undefined behavior — and I've been programming in C for longer than I
care to admit — can be subtle and pernicious.
C undefined behaviour? Is that like Ada multitasking? Handled totally
at the discretion of the implementer. Run synchronously on one machine
and asynchronously on another, and both meet the spec.

bill
Arne Vajhøj
2017-01-12 02:54:21 UTC
Post by Bill Gunshannon
Post by Stephen Hoffman
Post by Robert A. Brooks
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Post the source code and maybe somebody will run some scanning tools
(e.g. Klee) and fuzzers (e.g. AFL)?
C undefined behavior — and I've been programming in C for longer than I
care to admit — can be subtle and pernicious.
C undefined behaviour? Is that like Ada multitasking? Handled totally
at the discretion of the implementer. Run synchronously on one machine
and asynchronously on another, and both meet the spec.
Undefined behavior = anything can happen

Implementation-defined behavior = each vendor documents what will happen
in their environment, but it does not need to be the same across vendors
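
A minimal C sketch of that distinction, not taken from any particular codebase. The function names are hypothetical and chosen only for illustration. The first two functions return results that each compiler must document (implementation-defined); the third wraps an operation that would otherwise be undefined and gives it a defined result.

```c
#include <assert.h>
#include <limits.h>

/* Implementation-defined: whether plain char is signed. Each compiler
   documents its choice, and either answer conforms to the standard. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Implementation-defined when value is negative: the compiler documents
   what the vacated high bits are filled with. A shift count at or beyond
   the width of int would be undefined, hence the precondition. */
int shift_right(int value, unsigned count)
{
    assert(count < sizeof(int) * CHAR_BIT);
    return value >> count;
}

/* Undefined in plain C: shifting by the full width of the type or more.
   This hypothetical wrapper returns 0 for that case instead. */
unsigned shift_left_defined(unsigned value, unsigned count)
{
    if (count >= sizeof(unsigned) * CHAR_BIT)
        return 0u;
    return value << count;
}
```

Calling shift_left_defined(1u, 1000) is safe, whereas writing 1u << 1000 directly is the "anything can happen" case.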

Arne
Stephen Hoffman
2017-01-12 17:14:57 UTC
Post by Arne Vajhøj
Post by Bill Gunshannon
Post by Stephen Hoffman
Post by Robert A. Brooks
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Post the source code and maybe somebody will run some scanning tools
(e.g. Klee) and fuzzers (e.g. AFL)?
C undefined behavior — and I've been programming in C for longer than I
care to admit — can be subtle and pernicious.
C undefined behaviour? Is that like Ada multitasking? Handled totally
at the discretion of the implementer. Run synchronously on one machine
and asynchronously on another, and both meet the spec.
Undefined behavior = anything can happen
Implementation-defined behavior = each vendor documents what will
happen in their environment, but it does not need to be the same across vendors
Undefined behavior, sometimes known as nasal demons, is a problem in
C, where the compiler writers and particularly the code optimizers have
sufficient latitude to cause problems for developers who stray onto
something undefined. (The LLVM folks wrote up some notes on some of
that a while back, too. See below.)
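
One commonly cited illustration of that latitude, sketched here with hypothetical function names: an overflow test that itself relies on signed overflow. Because signed overflow is undefined, an optimizer is allowed to assume it never happens and may reduce the first function to "return 0". The second function asks the same question without overflowing.

```c
#include <limits.h>

/* Broken idiom: when x == INT_MAX, x + 1 overflows, which is undefined.
   A compiler may assume that cannot happen and fold this to 0. */
int increment_overflows_broken(int x)
{
    return x + 1 < x;
}

/* Well-defined alternative: compare against the limit before adding. */
int increment_overflows(int x)
{
    return x == INT_MAX;
}
```

Whether a given compiler actually folds the broken version depends on the compiler and the optimization level, which is part of what makes this class of bug hard to spot in testing.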

I've re-learned C several times now since starting back in the K&R era,
and am due to review, relearn and rethink it yet again in a few
years, and I use it daily, and for some large projects. That relearning
has meant learning more about undefined behavior, about newer C standards,
and about libraries and frameworks available elsewhere (Objective-C and its
frameworks completely blow the sneakers off of C for application-level
work, but I digress) and, in the case of OpenVMS, learning more
about the many then-new new-to-OpenVMS C routines that arrived in the C
RTL during the later releases in the V7 range. The VSI folks have
accrued lists of requested C routines, too.

Having to back-port my C code to run on OpenVMS with its sorta-C99 is
no fun, either.

Fuzzers and analysis tools routinely find problems in long-running
code, too. That's irrespective of the implementation language.

Programming is becoming far more automated too, though that's not
(yet?) become common with (most of) the OpenVMS development projects
I've peeked in on. With other platforms, that automation includes
IDEs, formatters, static analyzers, fuzzers, tools that scan for
potential security bugs (there's a nice one that searches for hashed
passwords and public and private keys embedded in the source code,
too), some really handy automated testing frameworks and servers, DVCS,
and many other tools. These tools are available now,
platform-integrated of course, some free from the platform
vendor and some as good commercial offerings from third parties.
Yes, on some platforms I use IDEs now. They work, they make the
development work faster, and they're more efficient than
edit-compile-link-test.

If you're doing C development and are doing your development in the
same way with the same tools and the same calls as you've done for a
decade or more, maybe look around anew? Wouldn't want to go and get
all Rust-y, eh?

Some material to ponder, for C programmers...
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
http://shop.oreilly.com/product/0636920025108.do
http://icube-icps.unistra.fr/index.php/Jens_Gustedt
https://www.sans.org/top25-software-errors/
--
Pure Personal Opinion | HoffmanLabs LLC
John Reagan
2017-01-12 19:03:49 UTC
Post by Stephen Hoffman
If you're doing C development and are doing your development in the
same way with the same tools and the same calls as you've done for a
decade or more, maybe look around anew? Wouldn't want to go and get
all Rust-y, eh?
Rust is no different than other languages. They have an entire book dedicated to writing "correct unsafe Rust" code.


https://doc.rust-lang.org/nomicon/

This book digs into all the awful details that are necessary to understand in order to write correct Unsafe Rust programs. Due to the nature of this problem, it may lead to unleashing untold horrors that shatter your psyche into a billion infinitesimal fragments of despair.

Should you wish a long and happy career of writing Rust programs, you should turn back now and forget you ever saw this book. It is not necessary. However if you intend to write unsafe code -- or just want to dig into the guts of the language -- this book contains invaluable information.
John Reagan
2017-01-12 19:08:57 UTC
Post by John Reagan
Rust is no different than other languages. They have an entire book dedicated to writing "correct unsafe Rust" code.
https://doc.rust-lang.org/nomicon/
This book digs into all the awful details that are necessary to understand in order to write correct Unsafe Rust programs. Due to the nature of this problem, it may lead to unleashing untold horrors that shatter your psyche into a billion infinitesimal fragments of despair.
Should you wish a long and happy career of writing Rust programs, you should turn back now and forget you ever saw this book. It is not necessary. However if you intend to write unsafe code -- or just want to dig into the guts of the language -- this book contains invaluable information.
I should add that to get Rust to allow you to screw yourself, you have to use the "unsafe" keyword, which says "I'm smart" or "Hold my beer and watch this!" This isn't much different from Ada's UNCHECKED_CONVERSION. It lets you easily search your source code to find all the places where you might lie to the compiler (and to yourself).
Bill Gunshannon
2017-01-12 19:56:17 UTC
Post by John Reagan
Rust is no different than other languages. They have an entire book dedicated to writing "correct unsafe Rust" code.
https://doc.rust-lang.org/nomicon/
This book digs into all the awful details that are necessary to understand in order to write correct Unsafe Rust programs. Due to the nature of this problem, it may lead to unleashing untold horrors that shatter your psyche into a billion infinitesimal fragments of despair.
Should you wish a long and happy career of writing Rust programs, you should turn back now and forget you ever saw this book. It is not necessary. However if you intend to write unsafe code -- or just want to dig into the guts of the language -- this book contains invaluable information.
Ada was no different in the end. Somewhere I have a textbook that
devotes an entire chapter to getting around strong typing and all
the other C-isms people here are always railing against. One can write
bad code with any language. Relying on the language to save your
bacon is never a good idea.

bill
Arne Vajhøj
2017-01-13 00:07:10 UTC
Post by Bill Gunshannon
Post by John Reagan
Rust is no different than other languages. They have an entire book
dedicated to writing "correct unsafe Rust" code.
https://doc.rust-lang.org/nomicon/
This book digs into all the awful details that are necessary to
understand in order to write correct Unsafe Rust programs. Due to the
nature of this problem, it may lead to unleashing untold horrors that
shatter your psyche into a billion infinitesimal fragments of despair.
Should you wish a long and happy career of writing Rust programs, you
should turn back now and forget you ever saw this book. It is not
necessary. However if you intend to write unsafe code -- or just want
to dig into the guts of the language -- this book contains invaluable
information.
Ada was no different in the end. Somewhere I have a textbook that
devotes an entire chapter to getting around strong typing and all
the other C-isms people here are always railing against. One can write
bad code with any language. Relying on the language to save your
bacon is never a good idea.
It is possible to write bad code in any language.

But all languages are not equal in that regard.

Some languages make it hard to do.

Other languages make it easy to do.

Arne
Simon Clubley
2017-01-13 13:46:41 UTC
Post by Arne Vajhøj
It is possible to write bad code in any language.
But all languages are not equal in that regard.
Some languages make it hard to do.
And some languages (e.g. Ada) make it explicit in the code when
you are doing dodgy things. This makes it easy to apply more
attention to those areas during code review and to have another
think about whether this is really required.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Arne Vajhøj
2017-01-13 14:18:55 UTC
Post by Simon Clubley
Post by Arne Vajhøj
It is possible to write bad code in any language.
But all languages are not equal in that regard.
Some languages make it hard to do.
And some languages (e.g. Ada) make it explicit in the code when
you are doing dodgy things. This makes it easy to apply more
attention to those areas during code review and to have another
think about whether this is really required.
Yes.

Likewise Rust or C# unsafe blocks.

Arne
Arne Vajhøj
2017-01-13 00:11:39 UTC
Post by John Reagan
Rust is no different than other languages. They have an entire book
dedicated to writing "correct unsafe Rust" code.
https://doc.rust-lang.org/nomicon/
This book digs into all the awful details that are necessary to
understand in order to write correct Unsafe Rust programs. Due to the
nature of this problem, it may lead to unleashing untold horrors that
shatter your psyche into a billion infinitesimal fragments of
despair.
Should you wish a long and happy career of writing Rust programs, you
should turn back now and forget you ever saw this book. It is not
necessary. However if you intend to write unsafe code -- or just want
to dig into the guts of the language -- this book contains invaluable
information.
For certain types of programming, an unsafe programming style may be
needed to do the task in an efficient way.

But the unsafe block concept (which I believe was first introduced
in C#) is a great improvement IMHO.

It makes it a lot easier to identify the risky portions of the code.

Arne
Louis Krupp
2017-01-13 07:23:25 UTC
Post by Arne Vajhøj
Post by John Reagan
Rust is no different than other languages. They have an entire book
dedicated to writing "correct unsafe Rust" code.
https://doc.rust-lang.org/nomicon/
This book digs into all the awful details that are necessary to
understand in order to write correct Unsafe Rust programs. Due to the
nature of this problem, it may lead to unleashing untold horrors that
shatter your psyche into a billion infinitesimal fragments of
despair.
Should you wish a long and happy career of writing Rust programs, you
should turn back now and forget you ever saw this book. It is not
necessary. However if you intend to write unsafe code -- or just want
to dig into the guts of the language -- this book contains invaluable
information.
For certain types of programming, an unsafe programming style may be
needed to do the task in an efficient way.
But the unsafe block concept (which I believe was first introduced
in C#) is a great improvement IMHO.
For what it's worth, on one Unisys product line, the operating system
is called the Master Control Program, and its ALGOL-like
implementation language, NEWP, has a block directive named UNSAFE that
seems to serve the same purpose as the "unsafe" blocks under
discussion. The language and the directive date back to the early
1980s, when the products were known as Burroughs Large Systems.

See:


https://public.support.unisys.com/aseries/docs/ClearPath-MCP-15.0/PDF/86002003-406.pdf

The Large Systems line was introduced in about 1970, and the MCP was
written in a language called ESPOL, many features of which were
spectacularly unsafe all the time.

"ESPOL" stood for something like "Executive Systems Programming
Oriented Language." The meaning and origin of "NEWP" are a subject of
speculation.
Post by Arne Vajhøj
It makes it a lot easier to identify the risky portions of the code.
Arne
Louis
Arne Vajhøj
2017-01-13 14:20:48 UTC
Post by Louis Krupp
Post by Arne Vajhøj
Post by John Reagan
Rust is no different than other languages. They have an entire book
dedicated to writing "correct unsafe Rust" code.
https://doc.rust-lang.org/nomicon/
But the unsafe block concept (which I believe was first introduced
in C#) is a great improvement IMHO.
For what it's worth, on one Unisys product line, the operating system
is called the Master Control Program, and its ALGOL-like
implementation language, NEWP, has a block directive named UNSAFE that
seems to serve the same purpose as the "unsafe" blocks under
discussion. The language and the directive date back to the early
1980s, when the products were known as Burroughs Large Systems.
So Anders Hejlsberg also got it from somewhere.

:-)

Arne
Hans Vlems
2017-01-13 19:03:54 UTC
Louis, I wrote a similar comment and it was ignored. Possibly because MCP systems are thought extinct, or maybe Algol is not as exciting as Rust. I mean, anyone can easily read Algol (and NEWP), while C derivatives offer more options for very learned discussions on language semantics...
Hans
Simon Clubley
2017-01-13 20:54:47 UTC
Post by Hans Vlems
I mean anyone can easily read Algol (and NEWP) while C derivatives
offer more options for very learned discussions on language semantics...
ROTFL (in a good natured way. :-))

You want learned discussions on language semantics ? :-) :-)

Ok, well here you go:

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0127-1.txt

What you are looking at is the change log for an Ada Issue which
I submitted to the ARG in the middle of 2014. (The ARG is the group
charged with handling issues and suggested enhancements to Ada
before submitting them to the ISO WG9 committee to be ratified.)

The direct link to the latest reading version is:

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0127-1.txt?rev=1.10&raw=N

In the grand scheme of things my proposal is a relatively small
enhancement to the next version of Ada (my original proposal is
in the appendix section of the AI) but you can see the level of
detail and issues raised by even this small proposal.

What you are seeing in the above emails is also effectively only
the end result of the various discussions at the ARG meetings at
which the outstanding AIs are discussed.

Trust me Hans, it's not only the C style languages which have
the language lawyers. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Hans Vlems
2017-01-14 08:41:18 UTC
I've read the articles; however, my understanding of Ada is inadequate to even understand the proposal!
When discussions on language semantics arrive at this level, what does that imply for the quality of the source code produced by Joe Average, I wonder?
Perhaps deep down I'm an assembler programmer after all? :-)
Hans
Simon Clubley
2017-01-14 18:43:05 UTC
Post by Hans Vlems
I've read the articles; however, my understanding of Ada is inadequate
to even understand the proposal!
In fairness, I should point out that Ada comes with its own
terminology, as many ecosystems do, and it's easy to get lost if you
don't know it.

In summary:

The general idea is that I want to be able to model a device register
as a series of bitfields instead of C style masks and to be able to
update multiple bitfields at the same time in a single Read/Modify/Write
operation without having to use a temporary variable (which is ugly).
During the update the contents of the other unreferenced bitfields
must be preserved by the generated code.

The formal submission and subsequent ARG work extended this idea into
the general update of composite types after I received feedback about
my original ideas and how they could be applied to other situations.
Post by Hans Vlems
When discussions on language semantics arrive at this level what
does that imply for the source code quality produced by Joe Average
I wonder.
If the language maintainers do their job correctly then Mr/Mrs Average
should never have to worry about some of the issues raised by the time
they get to use the new feature. :-)
Post by Hans Vlems
Perhaps deep down I'm an assembler programmer after all? :-)
Not at all. It's just that the language used is usually somewhat
formal in some situations like this one. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Hans Vlems
2017-01-15 08:48:10 UTC
I'm still not sure that I understand the objective. If it has to do with accessing individual bits in a register (or variable), then look at the way Algol (the Burroughs derivative, that is) solved it:
    integer i;
    i.[2:1]:=1;
Here [B:L] denotes a field in which B is the starting bit (in the range 0 .. 47 for Burroughs) and L is the number of bits involved; B is the most significant bit of the field.
Assuming i starts out as zero, the code snippet leaves 4 in i.
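
For readers more at home in C, a rough sketch of the same partial-word assignment. The helper name set_field is hypothetical, and a 64-bit word is used here rather than the 48-bit Burroughs word; the field is described the same way, by its most significant bit and its length.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store value into the len-bit field of word whose
   most significant bit is start, leaving every other bit unchanged. */
uint64_t set_field(uint64_t word, unsigned start, unsigned len, uint64_t value)
{
    assert(start < 64 && len >= 1 && len <= start + 1);
    unsigned low = start - len + 1;   /* lowest bit number of the field */
    uint64_t mask = (len == 64) ? UINT64_MAX : ((UINT64_C(1) << len) - 1);
    return (word & ~(mask << low)) | ((value & mask) << low);
}
```

With this, set_field(0, 2, 1, 1) yields 4, matching the Algol snippet when i starts at zero.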
Hans
Johnny Billquist
2017-01-15 15:05:11 UTC
Post by Hans Vlems
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for Burroughs) and L the number of bits involved, where B is the most significant bit in the series.
The code snippet leaves 4 in i.
I think Simon's point was about updating several fields in one operation.

In C, I could express the problem this way:

struct CSR {
    unsigned func : 11;
    unsigned mod  : 4;
    unsigned go   : 1;
};

volatile struct CSR foo;

.
.

foo.func = 1;
foo.go = 1;



Now, when you modify func, the CSR will be updated immediately with that
value, leaving all other bits alone. However, the actual access will
always be to the full 16 bits, because of how the hardware works.
Writing a 1 into the go bit will then cause another read and
write of the CSR, just to modify that one bit.
What you might want is some way to say that both of these
modifications of the CSR should happen simultaneously: write a 1
into the func bits and set the go bit in one operation.

In assembler, you'd do something like:

MOV #<1*FUNCSHIFT>!GOBIT,@#CSR

or possibly:

BIS #<1*FUNCSHIFT>!GOBIT,@#CSR


But if you have bitfields in your high level language, you obviously
want to use that instead of playing with this computational stuff to
work out what value to write to the CSR register. And then you get
into these separate statement issues for changing the different
bitfields. And then you also get into the problems with how/when to
read/write the CSR register as well. In C you have the volatile keyword,
which you pretty much do need to use for anything connected to hardware,
but that also means the compiler is not allowed to optimize any
accesses and references to the memory. So it becomes a mess.
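
In C, the MOV and BIS forms above could be approximated roughly as follows. This is only a sketch: the bit positions, the macro names and the two helper functions are assumptions for illustration, not taken from any real device.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (not from any real device): GO in bit 0, MOD in bits 1-4,
 * FUNC in bits 5-15 of a 16-bit CSR. */
#define CSR_GO_BIT     0x0001u
#define CSR_FUNC_SHIFT 5
#define CSR_FUNC_MASK  0xFFE0u

/* Counterpart of the MOV form: one write, every other bit becomes zero. */
static void csr_start_mov(volatile uint16_t *csr, uint16_t func)
{
    assert(func <= 0x7FFu);    /* func must fit in 11 bits */
    *csr = (uint16_t)((((unsigned)func << CSR_FUNC_SHIFT) & CSR_FUNC_MASK) | CSR_GO_BIT);
}

/* Counterpart of the BIS form: one read and one write, other bits preserved. */
static void csr_start_bis(volatile uint16_t *csr, uint16_t func)
{
    assert(func <= 0x7FFu);
    *csr = (uint16_t)(*csr | (((unsigned)func << CSR_FUNC_SHIFT) & CSR_FUNC_MASK) | CSR_GO_BIT);
}
```

Either way the value is computed by hand before the register is touched, which is exactly the shifting and masking that a bitfield description is supposed to hide.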

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
David Froble
2017-01-15 19:24:13 UTC
Post by Johnny Billquist
Post by Hans Vlems
I'm still not sure that I understand the objective. If it has to do
with accessing individual bits in a register (or variable) then look
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for
Burroughs) and L the number of bits involved, where B is the most
significant bit in the series.
The code snippet leaves 4 in i.
I think Simon's point was about updating several fields in one operation.
struct CSR {
unsigned func : 11;
unsigned mod : 4;
unsigned go : 1;
};
volatile struct CSR foo;
.
.
foo.func = 1;
foo.go = 1;
Now, when you modify func, the CSR will be updated immediately with that
value, leaving all other bits alone. However, the actual access will
always be to the full 16 bits, because of how the hardware works.
Writing a 1 into the go bit will then cause another read and
write of the CSR, just to modify that one bit.
What you might want is some way to say that you want both these
modifications of the CSR register to happen simultaneously. You want to
write a 1 into the func bits, and set the go bit.
But if you have bitfields in your high level language, you obviously
want to use that instead of playing with this computational stuff to
work out what value to write to the CSR register. And then you get
into these separate statement issues for changing the different
bitfields. And then you also get into the problems with how/when to
read/write the CSR register as well. In C you have the volatile keyword,
which you pretty much do need to use for anything connected to hardware,
but that also means the compiler is not allowed to optimize any
accesses and references to the memory. So it becomes a mess.
Johnny
Well, if I understand what you're trying to do, in Basic it might be simpler?

z% = 1% + 4% + 16% ! to set 3 bits, the rest zero

or (sic)

z% = z% or ( 1% + 4% + 16% ) ! to set the bits, leaving the rest alone

:-)
j***@yahoo.co.uk
2017-01-15 19:49:34 UTC
Post by David Froble
Post by Johnny Billquist
Post by Hans Vlems
I'm still not sure that I understand the objective. If it has to do
with accessing individual bits in a register (or variable) then look
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for
Burroughs) and L the number of bits involved, where B is the most
significant bit in the series.
The code snippet leaves 4 in i.
I think Simon's point was about updating several fields in one operation.
struct CSR {
unsigned func : 11;
unsigned mod : 4;
unsigned go : 1;
};
volatile struct CSR foo;
.
.
foo.func = 1;
foo.go = 1;
Now, when you modify func, the CSR will be updated immediately with that
value, leaving all other bits alone. However, the actual access will
always be to the full 16 bits, because of how the hardware works.
Writing a 1 into the go bit will then cause another read and
write of the CSR, just to modify that one bit.
What you might want is some way to say that you want both these
modifications of the CSR register to happen simultaneously. You want to
write a 1 into the func bits, and set the go bit.
But if you have bitfields in your high level language, you obviously
want to use that instead of playing with this computational stuff to
work out what value to write to the CSR register. And then you get
into these separate statement issues for changing the different
bitfields. And then you also get into the problems with how/when to
read/write the CSR register as well. In C you have the volatile keyword,
which you pretty much do need to use for anything connected to hardware,
but that also means the compiler is not allowed to optimize any
accesses and references to the memory. So it becomes a mess.
Johnny
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
or (sic)
z% = z% or ( 1% + 4% + 16% ) ! to set the bits, leaving the rest alone
:-)
I think it goes further than that. Johnny suggests one example
which works fine with memory-like data but which will fail where
reads (or writes) to a given item of data cause side effects. The
classic example is the reading of a register causing the contents
of that register to change.

Anyone who's played with software that reads old school PDP11
hardware will understand this. Others may find it troubling.

It's a hard task, and possibly near impossible to address in the
general case - but it was one of the first things I tried when
Whitesmiths C arrived on my employer's PDP11s. If I remember
rightly they got the obvious cases right, which was nice.
Whether that was by accident or design is unknown to me.
Johnny Billquist
2017-01-15 21:49:17 UTC
Post by David Froble
Post by Johnny Billquist
I think Simon's point was about updating several fields in one operation.
struct CSR {
unsigned func : 11;
unsigned mod : 4;
unsigned go : 1;
};
volatile struct CSR foo;
.
.
foo.func = 1;
foo.go = 1;
Now, when you modify func, the CSR will be updated immediately with that
value, leaving all other bits alone. However, the actual access will
always be to the full 16 bits, because of how the hardware works.
Writing a 1 into the go bit will then cause another read and
write of the CSR, just to modify that one bit.
What you might want is some way to say that you want both these
modifications of the CSR register to happen simultaneously. You want to
write a 1 into the func bits, and set the go bit.
But if you have bitfields in your high level language, you obviously
want to use that instead of playing with this computational stuff to
work out what value to write to the CSR register. And then you get
into these separate statement issues for changing the different
bitfields. And then you also get into the problems with how/when to
read/write the CSR register as well. In C you have the volatile keyword,
which you pretty much do need to use for anything connected to hardware,
but that also means the compiler is not allowed to optimize any
accesses and references to the memory. So it becomes a mess.
Johnny
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
or (sic)
z% = z% or ( 1% + 4% + 16% ) ! to set the bits, leaving the rest alone
:-)
Uh... Well... Yes... But you missed the point.
The point was that you do not want to do those kinds of calculations in
the code. You want to describe the CSR as bitfields, which match the
documentation, and then work on those bitfields directly instead of
doing all kinds of computation in the code beforehand to figure out
yourself what value should be written.

John Wallace also commented on code having side effects when just
reading or writing the register, which I sort of implicitly also touched
on in my reply. Your code would normally also avoid that, since you just
touch the register at the end, but you do understand (I hope) that you
will be doing both a read and a write to the register when you do an OR.
(Just as with my BIS)

Essentially what you did was just replicated what I did with my
assembler code, while not seeming to understand what I was trying to
explain with the C code, which was the actual point. :-)

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
David Froble
2017-01-16 00:38:48 UTC
Post by Johnny Billquist
Post by David Froble
Post by Johnny Billquist
I think Simon's point was about updating several fields in one operation.
struct CSR {
unsigned func : 11;
unsigned mod : 4;
unsigned go : 1;
};
volatile struct CSR foo;
.
.
foo.func = 1;
foo.go = 1;
Now, when you modify func, the CSR will be updated immediately with that
value, leaving all other bits alone. However, the actual access will
always be to the full 16 bits, because of how the hardware works.
Writing a 1 into the go bit will then cause another read and
write of the CSR, just to modify that one bit.
What you might want is some way to say that you want both these
modifications of the CSR register to happen simultaneously. You want to
write a 1 into the func bits, and set the go bit.
But if you have bitfields in your high level language, you obviously
want to use that instead of playing with this computational stuff to
work out what value to write to the CSR register. And then you get
into these separate statement issues for changing the different
bitfields. And then you also get into the problems with how/when to
read/write the CSR register as well. In C you have the volatile keyword,
which you pretty much do need to use for anything connected to hardware,
but that also means the compiler is not allowed to optimize any
accesses and references to the memory. So it becomes a mess.
Johnny
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
or (sic)
z% = z% or ( 1% + 4% + 16% ) ! to set the bits, leaving the rest alone
:-)
Uh... Well... Yes... But you missed the point.
The point was that you do not want to do that kind of calculations in
the code. You want to describe the CSR as bitfields, which match
documentation, and then work on those bitfields directly instead of
doing all kind of computation in the code beforehand to figure out
yourself what value should be written.
I'm more than a bit (sic) familiar with the mnemonics used in VMS to represent
bits and offsets. I'm assuming this is what you're talking about?
Post by Johnny Billquist
John Wallace also commented on code having side effects when just
reading or writing the register, which I sortof implicitly also touched
in my reply. Your code would normally also avoid that, since you just
touch the register at the end, but you do understand (I hope) that you
will be doing both a read and a write to the register when you do an OR.
(Just as with my BIS)
Not sure how the Basic compiler would implement the above, but, since the
compiler is not known for efficiency, would not surprise me if it did more work
than necessary.

Nor was it clear to me that we were talking about registers ....
Post by Johnny Billquist
Essentially what you did was just replicated what I did with my
assembler code, while not seeming to understand what I was trying to
explain with the C code, which was the actual point. :-)
Johnny
Well, yes, me not understanding anything in C is well known ....
Johnny Billquist
2017-01-17 19:17:04 UTC
Post by David Froble
Post by Johnny Billquist
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
or (sic)
z% = z% or ( 1% + 4% + 16% ) ! to set the bits, leaving the rest alone
:-)
Uh... Well... Yes... But you missed the point.
The point was that you do not want to do that kind of calculations in
the code. You want to describe the CSR as bitfields, which match
documentation, and then work on those bitfields directly instead of
doing all kind of computation in the code beforehand to figure out
yourself what value should be written.
I'm more than a bit (sic) familiar with the mnemonics used in VMS to
represent bits and offsets. I'm assuming this is what you're talking
about?
Well, it wasn't me talking... I was just trying to interpret what Simon
was talking about.
Post by David Froble
Post by Johnny Billquist
John Wallace also commented on code having side effects when just
reading or writing the register, which I sortof implicitly also
touched in my reply. Your code would normally also avoid that, since
you just touch the register at the end, but you do understand (I hope)
that you will be doing both a read and a write to the register when
you do an OR.
(Just as with my BIS)
Not sure how the Basic compiler would implement the above, but, since
the compiler is not known for efficiency, would not surprise me if it
did more work than necessary.
Agreed. But all of that is honestly more of a side topic to the basic
thing that Simon was going on about.
Post by David Froble
Nor was it clear to me that we were talking about registers ....
Registers are the most obvious place where this matters. But sometimes
it might also matter in other contexts, with other types of values.
Post by David Froble
Post by Johnny Billquist
Essentially what you did was just replicated what I did with my
assembler code, while not seeming to understand what I was trying to
explain with the C code, which was the actual point. :-)
Johnny
Well, yes, me not understanding anything in C is well known ....
Fair enough. So the basic point was that instead of just looking at a
value (variable, register, whatever) as a 16 bit integer (or whatever
size), you can describe the various fields inside that value, and let
the compiler figure out how much to shift and fiddle, and so on.
But the problem then becomes a question of atomicity, if you want to
change several bitfields.
And that is what my C code tried to illustrate, and then I gave some
assembler code to illustrate how you'd probably do things today,
computing everything before doing a single update of the
variable/register/whatever.

And Simon has a proposal for Ada (if I read things right) which would
allow you to describe such a thing with bitfields, and still be able to
change several of them with just one write in the resulting compiled code.

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
Simon Clubley
2017-01-17 19:27:54 UTC
Post by Johnny Billquist
Fair enough. So the basic point was that instead of just looking at a
value (variable, register, whatever) as a 16 bit integer (or whatever
size), you can describe the various fields inside that value, and let
the compiler figure out how much to shift and fiddle, and so on.
But the problem then becomes a question of atomicity, if you want to
change several bitfields.
And that is what my C code tried to illustrate, and then I gave some
assembler code to illustrate how you'd probably do things today,
computing everything before doing a single update of the
variable/register/whatever.
And Simon has a proposal for Ada (if I read things right) which would
allow you to describe such a thing with bitfields, and still be able to
change several of them with just one write in the resulting compiled code.
You understand correctly Johnny. This is exactly what the proposal
is all about and the critical thing is that you would be able to
change them all directly at the same time within one assignment
statement instead of having to use a temporary variable to store the
changes as you build them up field by field.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Simon Clubley
2017-01-16 00:51:52 UTC
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
No. This is just using Basic syntax instead of C syntax to manipulate
an opaque integer (as far as the compiler is concerned) by using
bitmasks. This is the standard way it's done in C (and Ada at the
moment unless you use a temporary variable to map a record; the
temporary variable is preferred but messy).

This kind of opaque integer code can get ugly and error prone very
quickly.

The following bit of homework (if you are so inclined to do it :-))
should make it clear.

Assume z% is a 32 bit integer the address of which is directly mapped
onto a 32 bit device register. We will call the least significant byte
of z% byte 0 and the most significant byte of z% is byte 3.

Write exactly one Basic statement to update z% in the following way:

1) Set byte 2 to 0
2) Set byte 0 to 33
3) Set the lower 4 bits of byte 1 to 12
4) Preserve the upper 4 bits of byte 1 and the whole of byte 3.
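
For readers who think in C rather than Basic, the four requirements above could be met with bitmasks roughly as sketched below. The function and macro names are hypothetical, and the register is modelled as a plain volatile 32-bit location. Note how the constants have to be worked out by hand, which is where the mistakes tend to creep in.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes are numbered as in the exercise: byte 0 is the least significant. */
#define KEEP_MASK UINT32_C(0xFF00F000)  /* byte 3 and the upper 4 bits of byte 1 */
#define NEW_BITS  UINT32_C(0x00000C21)  /* 12 in the low nibble of byte 1, 33 in byte 0 */

static void update_register(volatile uint32_t *z)
{
    assert((KEEP_MASK & NEW_BITS) == 0);  /* kept and replaced bits must not overlap */

    /* One statement: a single read and a single write; byte 2 is cleared by the mask. */
    *z = (*z & KEEP_MASK) | NEW_BITS;
}
```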

Now imagine that you could instead define a record which maps on to
this 32 bit device register and further assume the language gives
you the ability to list the multiple fields to be updated in a single
assignment statement so that you only do a single Read/Modify/Write
sequence.

This is much cleaner than the above code and is what will hopefully
be in the next version of Ada.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
j***@yahoo.co.uk
2017-01-16 01:20:02 UTC
Post by Simon Clubley
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
No. This is just using Basic syntax instead of C syntax to manipulate
an opaque integer (as far as the compiler is concerned) by using
bitmasks. This is the standard way it's done in C (and Ada at the
moment unless you use a temporary variable to map a record; the
temporary variable is preferred but messy).
This kind of opaque integer code can get ugly and error prone very
quickly.
The following bit of homework (if you are so inclined to do it :-))
should make it clear.
Assume z% is a 32 bit integer the address of which is directly mapped
onto a 32 bit device register. We will call the least significant byte
of z% byte 0 and the most significant byte of z% is byte 3.
1) Set byte 2 to 0
2) Set byte 0 to 33
3) Set the lower 4 bits of byte 1 to 12
4) Preserve the upper 4 bits of byte 1 and the whole of byte 3.
Now imagine that you could instead define a record which maps on to
this 32 bit device register and further assume the language gives
you the ability to list the multiple fields to be updated in a single
assignment statement so that you only do a single Read/Modify/Write
sequence.
This is much cleaner than the above code and is what will hopefully
be in the next version of Ada.
Simon.
--
Microsoft: Bringing you 1980s technology to a 21st century world
Hardware used to behave like this (e.g. indivisible read
modify write bus cycles for use with hw registers where a
read has side effects), back in PDP11 days.

Modern RISC processors frequently don't know how to do a
generic indivisible read modify write operation, surely?
They can load from "memory" addresses to CPU registers,
and store from CPU registers to "memory" addresses, but
can't do the two or three "memory" operand instructions
that PDP11 and VAX used to do.

Consequently they must use other techniques for indivisible
operations, and hardware designers surely have to
accommodate that?

If that's the case, I'm a bit at a loss to see how *any*
language can guarantee to make the generic indivisible
RMW facility available. But maybe it has other uses
currently invisible to me.

May be getting a bit off topic for nuVMS though. Or maybe
not?
Simon Clubley
2017-01-16 01:31:32 UTC
Post by j***@yahoo.co.uk
Hardware used to behave like this (e.g. indivisible read
modify write bus cycles for use with hw registers where a
read has side effects), back in PDP11 days.
Modern RISC processors frequently don't know how to do a
generic indivisible read modify write operation, surely?
They can load from "memory" addresses to CPU registers,
and store from CPU registers to "memory" addresses, but
can't do the two or three "memory" operand instructions
that PDP11 and VAX used to do.
Sorry, but you have missed the point. :-)

It's not the hardware at play here but the code generated by the
compiler which cannot be allowed to read and write the same device
register multiple times while updating multiple fields in the
device register as part of one operation.

This is why most people just take the easy (but ugly and error
prone) method of using an opaque integer for the device register
and apply bitmasks on this integer.

What I want to see introduced into the next version of Ada is
the ability to use records to model a device register (which you
can currently do) and then update multiple fields in that record
as part of one assignment statement and hence a single RMW sequence
(which you cannot currently do in Ada).
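
Until something like that exists, the temporary-variable approach can be sketched in C as below. Everything named here is hypothetical: fake_reg with its reg_read/reg_write accessors stands in for a real memory-mapped register so that the accesses can be counted, and the field names and widths in union ctrl are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a memory-mapped register so that accesses can be counted. */
static uint32_t fake_reg;
static unsigned reads, writes;

static uint32_t reg_read(void)        { reads++;  return fake_reg; }
static void     reg_write(uint32_t v) { writes++; fake_reg = v; }

/* Record view of the register. Bit-field placement is implementation-defined,
 * so a real driver would have to check it against the compiler in use. */
union ctrl {
    uint32_t raw;
    struct {
        unsigned count  : 8;
        unsigned mode   : 4;
        unsigned flags  : 4;
        unsigned unused : 16;
    } f;
};

static void update_ctrl(unsigned count, unsigned mode)
{
    union ctrl tmp;

    assert(count <= 0xFFu && mode <= 0xFu);

    tmp.raw = reg_read();   /* the only read of the register */
    tmp.f.count = count;    /* field updates touch the temporary only */
    tmp.f.mode = mode;
    reg_write(tmp.raw);     /* the only write of the register */
}
```

The record keeps the field names in the source, but the temporary and the explicit read and write are the boilerplate that a single multi-field assignment would remove.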

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
David Froble
2017-01-16 07:06:39 UTC
Post by Simon Clubley
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
No. This is just using Basic syntax instead of C syntax to manipulate
an opaque integer (as far as the compiler is concerned) by using
bitmasks. This is the standard way it's done in C (and Ada at the
moment unless you use a temporary variable to map a record; the
temporary variable is preferred but messy).
This kind of opaque integer code can get ugly and error prone very
quickly.
Well, if you're as young as you claim, then what we're talking about has served
me well since before you were born.

With a bit (sic) of care, it's not ugly, or error prone, if a coder is
proficient, and careful with his work.
Post by Simon Clubley
The following bit of homework (if you are so inclined to do it :-))
should make it clear.
Assume z% is a 32 bit integer the address of which is directly mapped
onto a 32 bit device register. We will call the least significant byte
of z% byte 0 and the most significant byte of z% is byte 3.
1) Set byte 2 to 0
2) Set byte 0 to 33
3) Set the lower 4 bits of byte 1 to 12
4) Preserve the upper 4 bits of byte 1 and the whole of byte 3.
I could do that. Maybe not tonight, but if you don't believe me, let me know
and I'll provide the code.
Post by Simon Clubley
Now imagine that you could instead define a record which maps on to
this 32 bit device register and further assume the language gives
you the ability to list the multiple fields to be updated in a single
assignment statement so that you only do a single Read/Modify/Write
sequence.
This is much cleaner than the above code and is what will hopefully
be in the next version of Ada.
There is no perfection. But, while I'll admit things could always be easier at
one point, it's because more work is done elsewhere.

Just because some people say things enough times, they may get some to think
it's true. It's not. Work is work. TANSTAAFL! Whether it's in a more complex
compiler, or in packaged procedures (PYTHON), or however, it all comes down to
ones and zeros and the work to perform a task is pretty much the same,
regardless of where it's done.

Hey, I'm a great believer in packaged procedures. Over the last 40 some years,
I've written many, and use them often. Got whole libraries of such. Might do
some things you never thought of.
Simon Clubley
2017-01-16 14:03:58 UTC
Post by David Froble
Post by Simon Clubley
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
No. This is just using Basic syntax instead of C syntax to manipulate
an opaque integer (as far as the compiler is concerned) by using
bitmasks. This is the standard way it's done in C (and Ada at the
moment unless you use a temporary variable to map a record; the
temporary variable is preferred but messy).
This kind of opaque integer code can get ugly and error prone very
quickly.
Well, if you're as young as you claim, then what we're talking about has served
me well since before you were born.
I was around 40 years ago when you started, but not in the early 1960s.
In other words, the original TECO is actually older than I am...
Post by David Froble
With a bit (sic) of care, it's not ugly, or error prone, if a coder is
proficient, and careful with his work.
Careful David, you are starting to sound like some C programmers. :-)
Post by David Froble
Post by Simon Clubley
The following bit of homework (if you are so inclined to do it :-))
should make it clear.
Assume z% is a 32 bit integer the address of which is directly mapped
onto a 32 bit device register. We will call the least significant byte
of z% byte 0 and the most significant byte of z% is byte 3.
1) Set byte 2 to 0
2) Set byte 0 to 33
3) Set the lower 4 bits of byte 1 to 12
4) Preserve the upper 4 bits of byte 1 and the whole of byte 3.
I could do that. Maybe not tonight, but if you don't believe me, let me know
and I'll provide the code.
Actually, I fully believe you can do it just fine. All I am saying is
that it's possible to come up with language constructs which do it
much more elegantly.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
David Froble
2017-01-16 16:33:57 UTC
Post by Simon Clubley
Post by David Froble
Post by Simon Clubley
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
No. This is just using Basic syntax instead of C syntax to manipulate
an opaque integer (as far as the compiler is concerned) by using
bitmasks. This is the standard way it's done in C (and Ada at the
moment unless you use a temporary variable to map a record; the
temporary variable is preferred but messy).
This kind of opaque integer code can get ugly and error prone very
quickly.
Well, if you're as young as you claim, then what we're talking about has served
me well since before you were born.
I was around 40 years ago when you started, but not in the early 1960s.
In other words, the original TECO is actually older than I am...
Post by David Froble
With a bit (sic) of care, it's not ugly, or error prone, if a coder is
proficient, and careful with his work.
Careful David, you are starting to sound like some C programmers. :-)
Maybe because they are right. Not that I personally like the syntax of C, but,
people used to program with jumpers, and if done correctly, it worked.
Post by Simon Clubley
Post by David Froble
Post by Simon Clubley
The following bit of homework (if you are so inclined to do it :-))
should make it clear.
Assume z% is a 32 bit integer the address of which is directly mapped
onto a 32 bit device register. We will call the least significant byte
of z% byte 0 and the most significant byte of z% is byte 3.
1) Set byte 2 to 0
2) Set byte 0 to 33
3) Set the lower 4 bits of byte 1 to 12
4) Preserve the upper 4 bits of byte 1 and the whole of byte 3.
I could do that. Maybe not tonight, but if you don't believe me, let me know
and I'll provide the code.
Actually, I fully believe you can do it just fine. All I am saying is
that it's possible to come up with language constructs which do it
much more elegantly.
Simon.
As I've written in the past, more than once, I believe that the computer is
there to serve (sic) and that means doing the grunt work. Sure, if a compiler
can be set up to do things in an easier manner, that's a good thing. Easier is
better.

But, really, we're talking about two things here, one is better tools, which is
good. The other is coder proficiency. As many have observed, regardless of the
tools, both good and bad code can be written.

I do get a bit bent out of shape when people tend to blame the tools. Some may
be better than others, but it's the programmer's task to understand the tools
(s)he's using. To just say things, such as Macro-32 is bad, just means the
person saying that doesn't understand the tool. Sure, it may take a bit more
effort (or a lot) to write good maintainable code, but it can be done. If it
isn't, it's mis-use of the tool, not the tool.

This is a complex issue, and there is no one answer. Should we call some PHP
hacker incompetent because (s)he doesn't understand binary? Or do we again look
at the computer as a tool to do the work? Should a user need to understand
binary in order to do some work? Maybe, maybe not.

I will observe that the PHP hacker might not like my opinion ....

:-)
Bob Gezelter
2017-01-17 16:59:43 UTC
Post by Simon Clubley
Post by David Froble
Post by Simon Clubley
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
No. This is just using Basic syntax instead of C syntax to manipulate
an opaque integer (as far as the compiler is concerned) by using
bitmasks. This is the standard way it's done in C (and Ada at the
moment unless you use a temporary variable to map a record; the
temporary variable is preferred but messy).
This kind of opaque integer code can get ugly and error prone very
quickly.
Well, if you're as young as you claim, then what we're talking about has served
me well since before you were born.
I was around 40 years ago when you started, but not in the early 1960s.
In other words, the original TECO is actually older than I am...
Post by David Froble
With a bit (sic) of care, it's not ugly, or error prone, if a coder is
proficient, and careful with his work.
Careful David, you are starting to sound like some C programmers. :-)
Post by David Froble
Post by Simon Clubley
The following bit of homework (if you are so inclined to do it :-))
should make it clear.
Assume z% is a 32 bit integer the address of which is directly mapped
onto a 32 bit device register. We will call the least significant byte
of z% byte 0 and the most significant byte of z% is byte 3.
1) Set byte 2 to 0
2) Set byte 0 to 33
3) Set the lower 4 bits of byte 1 to 12
4) Preserve the upper 4 bits of byte 1 and the whole of byte 3.
I could do that. Maybe not tonight, but if you don't believe me, let me know
and I'll provide the code.
Actually, I fully believe you can do it just fine. All I am saying is
that it's possible to come up with language constructs which do it
much more elegantly.
Simon.
--
Microsoft: Bringing you 1980s technology to a 21st century world
Gentlemen,

Not quite. The underlying presumption is that the device register is read/write. If one looks up the details of many devices, one will discover, upon careful reading, that x = x | 1 will not just set the low-order bit.

What it will do is:

- read location x
- or in a 0x01
- write back location x

This is true on ALL architectures. Even the BIS instruction on the PDP-11 does this as a Read-Modify-Write. If the device register is actually two registers (one read-only and one write-only) at the same address, the described code will not work.

The only way to do it is to keep a "shadow" copy of the contents of the write-only register and re-write the device register in its entirety.
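
A minimal sketch of the shadow-copy technique in C, assuming a 16-bit write-only register; the names shadow and wo_reg_update are hypothetical and not taken from any particular driver.

```c
#include <assert.h>
#include <stdint.h>

/* Software copy of the last value written; the hardware cannot be read back. */
static uint16_t shadow;

/* Hypothetical helper: set the bits in `set`, clear the bits in `clear`,
 * then rewrite the whole register with exactly one write and no read. */
static void wo_reg_update(volatile uint16_t *reg, uint16_t set, uint16_t clear)
{
    assert((set & clear) == 0);   /* a bit cannot be both set and cleared */

    shadow = (uint16_t)((shadow & ~clear) | set);
    *reg = shadow;
}
```

All updates have to go through the helper, otherwise the shadow and the hardware drift apart.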

Have seen many fall into this semantic trap.

- Bob Gezelter, http://www.rlgsc.com
Simon Clubley
2017-01-17 19:19:19 UTC
Post by Bob Gezelter
Gentlemen,
Not quite. Underlying presumption is that the device register is read/write.
Most of the registers I deal with are exactly that. The rest are
usually either FIFOs or the type of registers which fall into the
write 1 to do something or write 0 to leave the same something alone
category or some write-only configuration register. In any case,
a RMW sequence is not required for the latter types of register.
Post by Bob Gezelter
If one looks up the details of many devices, one will discover, upon careful
reading, that x = x | 1 will not just set low order bit.
- read location x
- or in a 0x01
- write back location x
Careful reading is not required, at least for the type of people
reading comp.os.vms who do this stuff. I think you will find that
everyone around here who has ever done anything with device registers
knows this and that it's the compiler and not the device which
determines this (and the size of the corresponding memory access).
Post by Bob Gezelter
This is true on ALL architectures. Even the BIS instruction on the PDP-11
does this as a Read-Modify-Write. If the device register is actually two
registers (one read-only and one write-only) at the same address, the
described code will not work.
The only way to do it is to keep a "shadow" copy of the contents of the
write-only register and re-write the device register in its entirety.
In the write only registers I am familiar with I can't think of a case
off the top of my head where I need to maintain a shadow copy of the
register contents as opposed to generating the value to be written on
the fly as required. I'm actually curious to know which type of hardware
you have encountered which requires this. Is it some old DEC hardware ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Johnny Billquist
2017-01-17 19:26:48 UTC
Permalink
Raw Message
Post by David Froble
Post by Simon Clubley
Post by David Froble
Well, if I understand what you're trying to do, in Basic it might be simpler?
z% = 1% + 4% + 16% ! to set 3 bits, the rest zero
No. This is just using Basic syntax instead of C syntax to manipulate
an opaque integer (as far as the compiler is concerned) by using
bitmasks. This is the standard way it's done in C (and Ada at the
moment unless you use a temporary variable to map a record; the
temporary variable is preferred but messy).
This kind of opaque integer code can get ugly and error prone very
quickly.
Well, if you're as young as you claim, then what we're talking about has
served me well since before you were born.
With a bit (sic) of care, it's not ugly, or error prone, if a coder is
proficient, and careful with his work.
Definitely can be done, and can be done in better or worse ways. But I
think it makes sense to deal with the bitfields more explicitly, and not
just do random integer arithmetic, and hope it adds up. :-)
Post by David Froble
Post by Simon Clubley
The following bit of homework (if you are so inclined to do it :-))
should make it clear.
Assume z% is a 32 bit integer the address of which is directly mapped
onto a 32 bit device register. We will call the least significant byte
of z% byte 0 and the most significant byte of z% is byte 3.
1) Set byte 2 to 0
2) Set byte 0 to 33
3) Set the lower 4 bits of byte 1 to 12
4) Preserve the upper 4 bits of byte 1 and the whole of byte 3.
I could do that. Maybe not tonight, but if you don't believe me, let me
know and I'll provide the code.
Of course you can. But the issue is that it is rather ugly code, might
be hard to read, and also hard to see what maps where in various
bitfields. No one is saying it can't be done. We have been doing it for
the last 50 years or so... :-)

Simon could have made it more challenging by not just dealing with the
low bits of each byte.

Yes, in the end, it all ends up as shifts, AND, OR and NOT operations.
Nothing more to it. The trick is that you cannot express a lot of this
in any high level language, and have the compiler combine all those
operations into just one write in the end.
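
For what it's worth, the four-step homework quoted above can be sketched in C with masks, one read and one write. The names KEEP_MASK, compute_update and update_reg are made up for this example, and the register is assumed to be an ordinary 32-bit read/write location.

```c
#include <stdint.h>

/* Bits that must survive: all of byte 3 and the upper 4 bits of byte 1. */
#define KEEP_MASK   UINT32_C(0xFF00F000)
#define BYTE0_VALUE UINT32_C(33)          /* step 2: byte 0 = 33 */
#define BYTE1_LOW4  (UINT32_C(12) << 8)   /* step 3: low 4 bits of byte 1 = 12 */

/* Pure computation: byte 2 ends up 0 because KEEP_MASK drops it. */
static uint32_t compute_update(uint32_t old)
{
    return (old & KEEP_MASK) | BYTE1_LOW4 | BYTE0_VALUE;
}

/* One read into a temporary, one write back. */
static void update_reg(volatile uint32_t *reg)
{
    uint32_t v = *reg;          /* the only read */
    *reg = compute_update(v);   /* the only write */
}
```

It works, but nothing in the source says which field is which; that knowledge lives entirely in the hex constants.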
Post by David Froble
Post by Simon Clubley
Now imagine that you could instead define a record which maps on to
this 32 bit device register and further assume the language gives
you the ability to list the multiple fields to be updated in a single
assignment statement so that you only do a single Read/Modify/Write
sequence.
This is much cleaner than the above code and is what will hopefully
be in the next version of Ada.
There is no perfection. But, while I'll admit things could always be
easier at one point, it's because more work is done elsewhere.
Always a trade off. Of course. :-)
But why not let computers do the boring tasks?
Post by David Froble
Just because some people say things enough times, they may get some to
think it's true. It's not. Work is work. TANSTAAFL! Whether it's in
a more complex compiler, or in packaged procedures (PYTHON), or however,
it all comes down to ones and zeros and the work to perform a task is
pretty much the same, regardless of where it's done.
Yes.
Post by David Froble
Hey, I'm a great believer in packaged procedures. Over the last 40 some
years, I've written many, and use them often. Got whole libraries of
such. Might do some things you never thought of.
In this case, I believe more in that it's things you haven't had to deal
with much. :-)

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
Arne Vajhøj
2017-01-15 22:25:41 UTC
Permalink
Raw Message
Post by Johnny Billquist
Post by Hans Vlems
I'm still not sure that I understand the objective. If it has to do
with accessing individual bits in a register (or variable) then look
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for
Burroughs) and L the number of bits involved, where B is the most
significant bit in the series.
The code snippet leaves 4 in i.
I think Simons point was about updating several fields in one operation.
struct CSR {
func: 11;
mod: 4;
go: 1;
};
volatile struct CSR foo;
foo.func = 1;
foo.go = 1;
That C code may work like described above.

But the C standard at least C99 leaves a lot to the
implementation.

C99:

<quote>
An implementation may allocate any addressable storage unit large enough
to hold a bit-field. If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or
overlaps adjacent units is implementation-defined. The order of
allocation of bit-fields within a unit (high-order to low-order or
low-order to high-order) is implementation-defined. The alignment of the
addressable storage unit is unspecified.
</quote>
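
A small sketch of how one might probe what a given compiler actually does with the layout quoted above. The union csr_view and the function go_bit_as_raw are hypothetical names, and overlaying the bit-fields with a 32-bit integer is an assumption about the storage unit the compiler picks.

```c
#include <stdint.h>

/* Hypothetical overlay of the CSR bit-fields on a raw integer. */
union csr_view {
    struct {
        unsigned func : 11;
        unsigned mod  : 4;
        unsigned go   : 1;
    } f;
    uint32_t raw;
};

/* Set only the 'go' field and return the raw word, so the caller can
 * see which bit the compiler chose for it. The answer is
 * implementation-defined and may differ between compilers. */
static uint32_t go_bit_as_raw(void)
{
    union csr_view v;
    v.raw = 0;
    v.f.go = 1;
    return v.raw;
}
```

The only thing that can be relied on is that one bit ends up set; which one is up to the implementation, which is the portability problem in a nutshell.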

Arne
Johnny Billquist
2017-01-17 19:31:11 UTC
Permalink
Raw Message
Post by Arne Vajhøj
Post by Johnny Billquist
Post by Hans Vlems
I'm still not sure that I understand the objective. If it has to do
with accessing individual bits in a register (or variable) then look
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for
Burroughs) and L the number of bits involved, where B is the most
significant bit in the series.
The code snippet leaves 4 in i.
I think Simons point was about updating several fields in one operation.
struct CSR {
func: 11;
mod: 4;
go: 1;
};
volatile struct CSR foo;
foo.func = 1;
foo.go = 1;
That C code may work like described above.
But the C standard at least C99 leaves a lot to the
implementation.
Pretty much everything is implementation defined with bitfields in C,
which is why I pretty much never use them.

But that is beside the point of the discussion I was trying to drive.

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
Simon Clubley
2017-01-16 01:19:20 UTC
Permalink
Raw Message
Post by Johnny Billquist
Post by Hans Vlems
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for Burroughs) and L the number of bits involved, where B is the most significant bit in the series.
The code snippet leaves 4 in i.
I think Simons point was about updating several fields in one operation.
That's exactly what I am trying to do and without using bitmasks or
temporary variables.
Post by Johnny Billquist
struct CSR {
func: 11;
mod: 4;
go: 1;
};
volatile struct CSR foo;
.
.
foo.func = 1;
foo.go = 1;
Now, when you modify func, the CSR will be updated immediately with that
value, leaving all other bits alone. However, the actual access will be
to the full 16 bit always, because of how the hardware works.
The writing of a 1 into the go bit will then cause another read and
write of the CSR, to just modify that one bit.
What you might want is some way to say that you want both these
modifications of the CSR register to happen simultaneously. You want to
write a 1 into the func bits, and set the go bit.
This is the absolutely critical bit.

Using Johnny's example above, on current hardware, there must be
_exactly_ one read of the device register into a CPU register (say R0),
the "func" and "go" fields must then be updated in R0 by the compiler's
generated code and then there must be _exactly_ one write of R0 back
to the device register.

Reads from a device register can have side effects and writes usually
do so all the fields in the device register which need altering
generally must be altered at the same time.
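
A hedged sketch of that sequence in C using masks rather than bit-fields. The bit positions (func in bits 0-10, mod in bits 11-14, go in bit 15) are an assumption about how Johnny's example CSR is laid out, and csr_start is a made-up name.

```c
#include <stdint.h>

#define CSR_FUNC_MASK 0x07FFu   /* assumed: func occupies bits 0-10 */
#define CSR_GO        0x8000u   /* assumed: go is bit 15 */

/* Load a function code and set go with one read and one write. */
static void csr_start(volatile uint16_t *csr, uint16_t func)
{
    uint16_t v = *csr;                                   /* exactly one read */
    v = (uint16_t)((v & ~CSR_FUNC_MASK) | (func & CSR_FUNC_MASK));
    v = (uint16_t)(v | CSR_GO);                          /* both fields updated in the copy */
    *csr = v;                                            /* exactly one write */
}
```

Because the pointer is volatile-qualified, the compiler has to perform the one read and the one write as written, while remaining free to optimise the arithmetic on the local copy.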

I've even worked with an ARM MCU (I forget which one) which had some
device registers where when some fields within the register are
updated you need to set a second field in the register as part of the
same write to say the first field has been updated.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Bob Gezelter
2017-01-16 13:35:47 UTC
Permalink
Raw Message
Post by Johnny Billquist
Post by Hans Vlems
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for Burroughs) and L the number of bits involved, where B is the most significant bit in the series.
The code snippet leaves 4 in i.
I think Simons point was about updating several fields in one operation.
struct CSR {
func: 11;
mod: 4;
go: 1;
};
volatile struct CSR foo;
.
.
foo.func = 1;
foo.go = 1;
Now, when you modify func, the CSR will be updated immediately with that
value, leaving all other bits alone. However, the actual access will be
to the full 16 bit always, because of how the hardware works.
The writing of a 1 into the go bit will then cause another read and
write of the CSR, to just modify that one bit.
What you might want is some way to say that you want both these
modifications of the CSR register to happen simultaneously. You want to
write a 1 into the func bits, and set the go bit.
bit, if you have bitfields in your high level language, you obviously
want to use them instead of playing with this computational stuff to
work out the value to write to the CSR register. And then you get
into these separate statement issues for changing the different
bitfields. And then you also get into the problems with how/when to
read/write the CSR register as well. In C you have the volatile keyword,
which you pretty much do need to use for anything connected to hardware,
but that also means the compiler is not allowed to optimize any
accesses and references to the memory. So it becomes a mess.
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
pdp is alive! || tryin' to stay hip" - B. Idol
Johnny,

One need be careful in this particular neighborhood.

It can be annoying, but more than a few devices have overmapped registers (a read-only and a write-only register, with the same address, but different semantics and contents).

In these situations, updating a field in the write only register REQUIRES a shadow copy of the entire write-only CSR, a Read/Modify/Write sequence will not yield the desired result.

Languages can be particularly poor at dealing with this semantic.

- Bob Gezelter, http://www.rlgsc.com
Simon Clubley
2017-01-16 14:14:31 UTC
Permalink
Raw Message
Post by Bob Gezelter
Johnny,
One need be careful in this particular neighborhood.
It can be annoying, but more than a few devices have overmapped
registers (a read-only and a write-only register, with the same address, but
different semantics and contents).
On the few device registers that I've seen with that setup you never
need to do a RMW sequence anyway. I don't know about older devices but
quite a lot of the time with current devices when that is true it's
usually some sort of byte/word/longword FIFO anyway so bitfields are
not involved.

In some other cases, it's one of those write 1 to change something or
zero to leave it alone registers.
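
To illustrate why no RMW is wanted for that kind of register, here is a small software model in C. Everything in it (the w1c_reg type and the w1c_read, w1c_write, ack_one and ack_one_rmw functions) is hypothetical and only mimics a write-1-to-clear status register; it is not real hardware access.

```c
#include <stdint.h>

/* Software model of a write-1-to-clear register: writing a 1 to a bit
 * clears it, writing a 0 leaves it alone. */
typedef struct { uint32_t pending; } w1c_reg;

static uint32_t w1c_read(const w1c_reg *r) { return r->pending; }
static void w1c_write(w1c_reg *r, uint32_t v) { r->pending &= ~v; }

/* Right: write just the bit to acknowledge. */
static void ack_one(w1c_reg *r, uint32_t bit)
{
    w1c_write(r, bit);
}

/* Wrong: a read-modify-write writes back every pending bit as a 1,
 * so all of them get cleared, not just the one intended. */
static void ack_one_rmw(w1c_reg *r, uint32_t bit)
{
    w1c_write(r, w1c_read(r) | bit);
}
```

With bits 0 and 2 pending, ack_one on bit 0 leaves bit 2 pending, while ack_one_rmw wipes both.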
Post by Bob Gezelter
In these situations, updating a field in the write only register
REQUIRES a shadow copy of the entire write-only CSR, a Read/Modify/Write
sequence will not yield the desired result.
Is this on the older DEC hardware ? I can't think of something off
the top of my head which I have dealt with which requires a full
shadow copy of a write only register in order to handle this situation.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Johnny Billquist
2017-01-17 19:33:26 UTC
Permalink
Raw Message
Post by Bob Gezelter
Johnny,
One need be careful in this particular neighborhood.
It can be annoying, but more than a few devices have overmapped registers (a read-only and a write-only register, with the same address, but different semantics and contents).
In these situations, updating a field in the write only register REQUIRES a shadow copy of the entire write-only CSR, a Read/Modify/Write sequence will not yield the desired result.
Languages can be particularly poor at dealing with this semantic.
Bob. You are absolutely right about that, but that is a different
problem and topic. :-)

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
John Reagan
2017-01-15 15:05:38 UTC
Permalink
Raw Message
Post by Hans Vlems
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for Burroughs) and L the number of bits involved, where B is the most significant bit in the series.
The code snippet leaves 4 in i.

The equivalent BLISS is

LOCAL I : INITIAL (0);
I<2,1> = 1;
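
For comparison, a rough C sketch of the same field insert, following the [B:L] convention Hans described, where B is the most significant bit of the field and L its length. The function name field_insert is made up for this example, and a 64-bit word stands in for the 48-bit Burroughs word.

```c
#include <stdint.h>

/* Insert 'value' into the L-bit field of 'word' whose most significant
 * bit is bit B. Requires 1 <= l <= b + 1 and b <= 63. */
static uint64_t field_insert(uint64_t word, unsigned b, unsigned l,
                             uint64_t value)
{
    unsigned lsb = b - l + 1;                      /* lowest bit of the field */
    uint64_t ones = (l >= 64) ? ~UINT64_C(0)
                              : ((UINT64_C(1) << l) - 1);
    uint64_t mask = ones << lsb;                   /* bits covered by the field */
    return (word & ~mask) | ((value << lsb) & mask);
}
```

Calling field_insert(0, 2, 1, 1) gives 4, matching the snippet above.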
Bob Koehler
2017-01-17 14:26:33 UTC
Permalink
Raw Message
Post by Hans Vlems
integer i;
i.[2:1]:=1;
Where [B:L] represents B the starting bit (in the range 0 .. 47 for Burroughs) and L the number of bits involved, where B is the most significant bit in the series.
The code snippet leaves 4 in i.
Hans
Bliss also has a solution, which allows the user to specify the start
bit, the size, and the extension bit. But I prefer to lay out bit
fields as in C, listing the fields in a struct.
Louis Krupp
2017-01-13 21:27:30 UTC
Permalink
Raw Message
On Fri, 13 Jan 2017 11:03:54 -0800 (PST), Hans Vlems
Post by Hans Vlems
Louis, I wrote a similar comment and it was ignored. Possibly because MCP systems are thought extinct, or maybe Algol is not as exciting as Rust. I mean anyone can easily read Algol (and NEWP) while C derivatives offer more options for very learned discussions on language semantics...
Hans
Hans,

I'm not seeing your post in this thread. Perhaps no one else saw it
either, as I'm sure someone would have replied.

I agree with your comment on languages. I would speculate that
bounded-context, recursive descent compilation makes for a more
civilized language.

(We're a little off-topic for VMS. Maybe someone who knows something
about OpenVMS compilers will chime in.)

Louis
Simon Clubley
2017-01-13 13:44:22 UTC
Permalink
Raw Message
Post by Arne Vajhøj
But the unsafe block concept (which I believe was first introduced
in C#) is a great improvement IMHO.
What about Modula-3's unsafe module capability ?

(Modula-3 came before C#)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Bill Gunshannon
2017-01-13 14:21:03 UTC
Permalink
Raw Message
Post by Simon Clubley
Post by Arne Vajhøj
But the unsafe block concept (which I believe was first introduced
in C#) is a great improvement IMHO.
What about Modula-3's unsafe module capability ?
(Modula-3 came before C#)
But did anyone ever use Modula-3? :-)

bill
Simon Clubley
2017-01-13 20:14:29 UTC
Permalink
Raw Message
Post by Bill Gunshannon
Post by Simon Clubley
What about Modula-3's unsafe module capability ?
(Modula-3 came before C#)
But did anyone ever use Modula-3? :-)
Would you like a saucer of milk ? :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Arne Vajhøj
2017-01-13 14:22:05 UTC
Permalink
Raw Message
Post by Simon Clubley
Post by Arne Vajhøj
But the unsafe block concept (which I believe was first introduced
in C#) is a great improvement IMHO.
What about Modula-3's unsafe module capability ?
(Modula-3 came before C#)
Another predecessor!

:-)

Arne
Stephen Hoffman
2017-01-13 15:01:48 UTC
Permalink
Raw Message
Post by John Reagan
Rust is no different than other languages. They have an entire book
dedicated to writing "correct unsafe Rust" code.
https://doc.rust-lang.org/nomicon/
This book digs into all the awful details that are necessary to
understand in order to write correct Unsafe Rust programs. Due to the
nature of this problem, it may lead to unleashing untold horrors that
shatter your psyche into a billion infinitesimal fragments of despair.
Should you wish a long and happy career of writing Rust programs, you
should turn back now and forget you ever saw this book. It is not
necessary. However if you intend to write unsafe code -- or just want
to dig into the guts of the language -- this book contains invaluable
information.
Ayup. As the implementation and terminology might not be clear to
somebody that's not looked at Rust...

https://doc.rust-lang.org/book/unsafe.html

The difference from C being that you have to enable writing unsafe code
in Rust, either for specific reasons in the code or because you're
calling into an external library written in some other language. Rust
makes calling C routines pretty easy. C itself doesn't differentiate
safe from unsafe code. In some projects, we've #ifdef'd calls we know
are risky and that have accordingly been (locally) deprecated, but null
handling in C can be... interesting to get right. Source code
scanning tools can help here, as can a compiler that deprecates calls
for you, and detects undefined behavior. Rust lacks UB, outside of the
unsafe blocks.

Specifically unsafe code in Rust is rather more akin to everyday C
code, unfortunately.

I'd not expect VSI to add Rust, Go or such to the repertoire. Not any
time soon, that is. Rust isn't a panacea. I would hope that the C
tools presently worked on and used by VSI and then eventually available
to us would be better about flagging undefined behavior, the riskiest
of C library calls, and such. Clang does decently well and is on par
with the existing OpenVMS C compilers around the compilation
processing, and the Clang error messages are pretty good. But being C,
there's always and unfortunately easy access to undefined behavior.
And while I like C... I've learned to trust it rather less; not as
much as I used to.
--
Pure Personal Opinion | HoffmanLabs LLC
Simon Clubley
2017-01-13 20:21:04 UTC
Permalink
Raw Message
Post by Stephen Hoffman
In some projects, we've #ifdef'd calls we know
are risky and that have accordingly been (locally) deprecated, but null
handling in C can be... interesting to get right.
Of course, there's no reason why the string handling library you
use in C has to have null terminated strings at the core of its
implementation.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
John Reagan
2017-01-13 20:59:28 UTC
Permalink
Raw Message
From the C99 standard (sorry for the poor cut n paste on my phone).

A string is a contiguous sequence of characters terminated by and including the first null character. The term multibyte string is sometimes used instead to emphasize special processing given to multibyte characters contained in the string or to avoid confusion with a wide string. A pointer to a string is a pointer to its initial (lowest addressed) character. The length of a string is the number of bytes preceding the null character and the value of a string is the sequence of the values of the contained characters, in order.
Simon Clubley
2017-01-13 21:11:20 UTC
Permalink
Raw Message
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and including
the first null character. The term multibyte string is sometimes used instead to
emphasize special processing given to multibyte characters contained in the
string or to avoid confusion with a wide string. A pointer to a string is a
pointer to its initial (lowest addressed) character. The length of a string is
the number of bytes preceding the null character and the value of a string is
the sequence of the values of the contained characters, in order.
However, this doesn't change the fact that there are various safer
C string handling libraries which can be used in place of the standard
C string handling libraries just as easily.

Here's one I just found using Google (although I have not used it):

http://bstring.sourceforge.net/

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Bill Gunshannon
2017-01-13 21:56:42 UTC
Permalink
Raw Message
Post by Simon Clubley
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and including
the first null character. The term multibyte string is sometimes used instead to
emphasize special processing given to multibyte characters contained in the
string or to avoid confusion with a wide string. A pointer to a string is a
pointer to its initial (lowest addressed) character. The length of a string is
the number of bytes preceding the null character and the value of a string is
the sequence of the values of the contained characters, in order.
However, this doesn't change the fact that there are various safer
C string handling libraries which can be used in place of the standard
C string handling libraries just as easily.
http://bstring.sourceforge.net/
And all of this was known and fixed about 40 years ago and the
industry soundly rejected it. Where is the company that marketed
SafeC today? Maybe theirs should have been the one ANSI looked at
instead of K&R.
(And this from someone who still thinks K&R is C and everything else
was a new, derived language!!)

bill
Stephen Hoffman
2017-01-13 22:45:31 UTC
Permalink
Raw Message
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is sometimes
used instead to emphasize special processing given to multibyte
characters contained in the string or to avoid confusion with a wide
string. A pointer to a string is a pointer to its initial (lowest
addressed) character. The length of a string is the number of bytes
preceding the null character and the value of a string is the sequence
of the values of the contained characters, in order.
Ayup. C string handling sometimes makes me long for BASIC.
(Somebody please check on David. I think he may have fainted there.)
Or for C++, or — vastly better — Objective C.

C11 — C99 has been deprecated for a while, obviously — has somewhat
better support around UTF strings than the earlier C standards. Also
better threading. And BSD has the strl calls.
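
For anyone who hasn't met them, the BSD calls in question are strlcpy and strlcat. Below is a minimal sketch of the same idea for platforms that lack them; copy_bounded is a made-up name, modelled on how strlcpy behaves (always terminates the destination if it has any room, and returns the length of the source so truncation can be detected).

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst without writing more than dstsize bytes.
 * Returns strlen(src); a result >= dstsize means the copy was truncated. */
static size_t copy_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t srclen = strlen(src);
    if (dstsize > 0) {
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';          /* always terminated when there is room */
    }
    return srclen;
}
```

The caller still has to check the return value; the routine only guarantees that the buffer is not overrun.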

http://blog.smartbear.com/codereviewer/c11-a-new-c-standard-aiming-at-safer-programming/


Newer languages and platforms and tools tend to do better here, too.
I'm dealing with bash and a terminal emulator that use UTF-8 — what was
that about the problems with staying compatible with old terminals and
not updating or replacing the terminal driver for more modern
requirements — all the way through. It's handy not to have to hack or
escape stuff to get past the limits of ASCII or ~MCS encoding.

I like C. But it has issues. Alternatives and upgrades are
interesting, too. And no, I don't expect VSI to decide to haul off
and start getting all Rust-y in their new development work.
--
Pure Personal Opinion | HoffmanLabs LLC
David Froble
2017-01-13 23:06:53 UTC
Permalink
Raw Message
Post by Stephen Hoffman
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
Ayup. C string handling sometimes makes me long for BASIC.
(Somebody please check on David. I think he may have fainted there.)
Nope. Take more than that.

But, it's not the language, it's the compiler using decent library routines. I
could argue that Basic doesn't do anything with strings. At least from the
perspective that it's mostly or all library calls.

C just needs to be better at which library routines it uses ....

Regardless, such library routines are barred from kernel mode code ....
Johnny Billquist
2017-01-15 15:14:44 UTC
Permalink
Raw Message
Post by David Froble
Post by Stephen Hoffman
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
Ayup. C string handling sometimes makes me long for BASIC.
(Somebody please check on David. I think he may have fainted there.)
Nope. Take more than that.
But, it's not the language, it's the compiler using decent library
routines. I could argue that Basic doesn't do anything with strings.
At least from the perspective that it's mostly or all library calls.
Partly, and partly not. BASIC does not really have this as libraries, even
though the implementation might sit in some library. It's a part of the
language itself, and you cannot really separate the things in BASIC.
Post by David Froble
C just needs to be better at which library routines it uses ....
The problem (or one problem) with C is that the language doesn't really
have strings. You have pointers. And arrays... But actually, arrays are
pretty much just pointers as well. And strings are just arrays of
integers. And so, you have some convention on how to treat some arrays
as strings, in some special ways and cases. But the language still does
not have strings.

So, obviously, dealing with strings is always going to be a ride.

(But I do like C, I just don't consider it to be the solution to all the
problems in the world.)

Most of the discussions here, however, seem to focus too much on languages,
as if that is the problem and the solution. In the end, in my
experience, it's all about good programmers. Bad ones will create
problems no matter what language they write in. And unfortunately, bad
programmers outnumber good ones by a big margin, and it's only getting
worse as both academia and industry now think that the tools are the
solution to all problems.

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
David Froble
2017-01-15 19:45:22 UTC
Permalink
Raw Message
Post by Johnny Billquist
Post by David Froble
Post by Stephen Hoffman
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
Ayup. C string handling sometimes makes me long for BASIC.
(Somebody please check on David. I think he may have fainted there.)
Nope. Take more than that.
But, it's not the language, it's the compiler using decent library
routines. I could argue that Basic doesn't do anything with strings.
At least from the perspective that it's mostly or all library calls.
Partly, and partly not. BASIC does not really have this as libraries, even
though the implementation might sit in some library. It's a part of the
language itself, and you cannot really separate the things in BASIC.
The following is an example of what I'm trying to say:

Z2$MAIN
1 1 Z1$ = "abc"
2 Z2$ = "123"
3 Z3$ = Z1$ + Z2$
4 End

Z2$MAIN
Generated code
0000: .PSECT $CODE
0000: Z2$MAIN::
CFFC 0000: .WORD ^M<R2,R3,R4,R5,R6,R7,R8,R9,R10,R11,I>
52 FB AF 9E 0002: MOVAB .-3, R2
50 00000004 0G 9E 0006: MOVAB $PDATA+4, R0
51 50 D0 000D: MOVL R0, R1
00000000 GG 16 0010: JSB BAS$INIT_R8
0016: $T_0016:
FC AD FD AF 9E 0016: $L_1: MOVAB $L_1, -4(FP)
51 03 32 001B: CVTWL #3, R1
52 00000063 0G 9E 001E: MOVAB $PDATA+99, R2
50 5B AB 7E 0025: MOVAQ Z1$(R11), R0
00000000 GG 16 0029: JSB STR$COPY_R_R8
51 03 32 002F: CVTWL #3, R1
52 00000060 0G 9E 0032: MOVAB $PDATA+96, R2
50 63 AB 7E 0039: MOVAQ Z2$(R11), R0
00000000 GG 16 003D: JSB STR$COPY_R_R8
63 AB 7F 0043: PUSHAQ Z2$(R11)
5B AB 7F 0046: PUSHAQ Z1$(R11)
6B AB 7F 0049: PUSHAQ Z3$(R11)
00000000 GG 03 FB 004C: CALLS #3, STR$CONCAT
50 00000004 0G 9E 0053: MOVAB $PDATA+4, R0
00000000 GG 16 005A: JSB BAS$END_R8
50 01 D0 0060: MOVL #1, R0
04 0063: RET
0064: .END

Some snipping to attempt to make it fit and readable.

Note that the compiler does not produce any code to do any of the operations.
All it's doing is pushing arguments and invoking a library routine. This is an
example of what I've tried to say when I claim that Basic doesn't really do much
of the work.

And a John Reagan question. Why is there a RET at the end of a main? It's not
a subroutine.
Post by Johnny Billquist
Post by David Froble
C just needs to be better at which library routines it uses ....
The problem (or one problem) with C is that the language doesn't really
have strings. You have pointers. And arrays... But actually, arrays are
pretty much just pointers as well. And strings are just arrays of
integers. And so, you have some convention on how to treat some arrays
as strings, in some special ways and cases. But the language still does
not have strings.
I'll agree, the C compiler doesn't know about strings. But what is a string, or
any other variable or literal? It's an address, and perhaps some more data,
such as length. Which is how strings can be used in C, by setting up a
structure which includes an address, a length, and perhaps some other data.
Then all that's required is a library routine to work with the string.

The difference in Basic is that the compiler will set up the descriptor.
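
A rough sketch of that idea in C: a length-plus-address structure and one library routine that works on it. The cstr type and cstr_concat function are hypothetical names for this example; this is not the OpenVMS descriptor layout or the STR$CONCAT interface, just the same general shape.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: a length and an address, no terminator. */
typedef struct {
    size_t len;
    char  *ptr;
} cstr;

/* dst = a + b. dst->ptr must be NULL or heap-allocated, since the old
 * buffer is freed. Returns 0 on success, -1 on overflow or allocation
 * failure, in which case dst is left unchanged. */
static int cstr_concat(cstr *dst, const cstr *a, const cstr *b)
{
    if (a->len > SIZE_MAX - b->len)
        return -1;
    size_t n = a->len + b->len;
    char *p = malloc(n ? n : 1);
    if (p == NULL)
        return -1;
    if (a->len) memcpy(p, a->ptr, a->len);
    if (b->len) memcpy(p + a->len, b->ptr, b->len);
    free(dst->ptr);             /* release the previous contents */
    dst->ptr = p;
    dst->len = n;
    return 0;
}
```

The routine never looks for a terminating null, so embedded nulls are fine and the length is always known. What C does not do is build and maintain the structure for you.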
Post by Johnny Billquist
So, obviously, dealing with strings is always going to be a ride.
(But I do like C, I just don't consider it to be the solution to all the
problems in the world.)
Most of the discussions here, however, seem to focus too much on languages,
as if that is the problem and the solution. In the end, in my
experience, it's all about good programmers. Bad ones will create
problems no matter what language they write in. And unfortunately, bad
programmers outnumber good ones by a big margin, and it's only getting
worse as both academia and industry now think that the tools are the
solution to all problems.
Now there you got my 100% agreement.
Arne Vajhøj
2017-01-15 22:16:00 UTC
Post by David Froble
Z2$MAIN
1 1 Z1$ = "abc"
2 Z2$ = "123"
3 Z3$ = Z1$ + Z2$
4 End
Z2$MAIN
Generated code
0000: .PSECT $CODE
CFFC 0000: .WORD
^M<R2,R3,R4,R5,R6,R7,R8,R9,R10,R11,I>
04 0063: RET
0064: .END
And a John Reagan question. Why is there a RET at the end of a main?
It's not a subroutine.
Hmmm. I have always had a RET at the end of a Macro-32 main. So I am
not surprised that the compiler generates it. If it were not there,
then how would execution stop?

Arne
David Froble
2017-01-16 00:43:50 UTC
Post by Arne Vajhøj
Post by David Froble
Z2$MAIN
1 1 Z1$ = "abc"
2 Z2$ = "123"
3 Z3$ = Z1$ + Z2$
4 End
Z2$MAIN
Generated code
0000: .PSECT $CODE
CFFC 0000: .WORD
^M<R2,R3,R4,R5,R6,R7,R8,R9,R10,R11,I>
04 0063: RET
0064: .END
And a John Reagan question. Why is there a RET at the end of a main?
It's not a subroutine.
Hmmm. I have always had a RET at the end of a Macro-32 main. So I am
not surprised that the compiler generates it. If it were not there,
then how would execution stop?
Arne
Could be. Though I'd think the .END would be adequate. I was thinking that it
might be necessary to ensure the "1", or SS$_NORMAL, was reported to the CLI upon
program exit.
Bob Koehler
2017-01-17 14:19:34 UTC
Post by David Froble
Could be. Though I'd think the .END would be adequate. I was thinking that it
might be necessary to ensure the "1", or SS$_NORMAL, was reported to the CLI upon
program exit.
.END is not sufficient. That's just a directive to Macro-32 that the
instruction contributions to this routine are finished. It's not
seen by the CPU.
Johnny Billquist
2017-01-15 21:43:36 UTC
Post by David Froble
Post by Johnny Billquist
Post by David Froble
Post by Stephen Hoffman
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
Ayup. C string handling sometimes makes me long for BASIC.
(Somebody please check on David. I think he may have fainted there.)
Nope. Take more than that.
But, it's not the language, it's the compiler using decent library
routines. I could argue that Basic doesn't do anything with strings.
At least from the perspective that it's mostly or all library calls.
Partly, and partly not. BASIC does not really have this as libraries,
even though the implementation might sit in some library. It's a part
of the language itself, and you cannot really separate the things in
BASIC.
Z2$MAIN
1 1 Z1$ = "abc"
2 Z2$ = "123"
3 Z3$ = Z1$ + Z2$
4 End
Z2$MAIN
Generated code
0000: .PSECT $CODE
CFFC 0000: .WORD
^M<R2,R3,R4,R5,R6,R7,R8,R9,R10,R11,I>
52 FB AF 9E 0002: MOVAB .-3, R2
50 00000004 0G 9E 0006: MOVAB $PDATA+4, R0
51 50 D0 000D: MOVL R0, R1
00000000 GG 16 0010: JSB BAS$INIT_R8
FC AD FD AF 9E 0016: $L_1: MOVAB $L_1, -4(FP)
51 03 32 001B: CVTWL #3, R1
52 00000063 0G 9E 001E: MOVAB $PDATA+99, R2
50 5B AB 7E 0025: MOVAQ Z1$(R11), R0
00000000 GG 16 0029: JSB STR$COPY_R_R8
51 03 32 002F: CVTWL #3, R1
52 00000060 0G 9E 0032: MOVAB $PDATA+96, R2
50 63 AB 7E 0039: MOVAQ Z2$(R11), R0
00000000 GG 16 003D: JSB STR$COPY_R_R8
63 AB 7F 0043: PUSHAQ Z2$(R11)
5B AB 7F 0046: PUSHAQ Z1$(R11)
6B AB 7F 0049: PUSHAQ Z3$(R11)
00000000 GG 03 FB 004C: CALLS #3, STR$CONCAT
50 00000004 0G 9E 0053: MOVAB $PDATA+4, R0
00000000 GG 16 005A: JSB BAS$END_R8
50 01 D0 0060: MOVL #1, R0
04 0063: RET
0064: .END
Some snipping to attempt to make it fit and readable.
Note that the compiler does not produce any code to do any of the
operations. All it's doing is pushing arguments and invoking a library
routine. This is an example of what I've tried to say when I claim that
Basic doesn't really do much of the work.
I know. But my point is that this is an implementation detail of the
specific compiler that you are using. You cannot see that it is a
library routine that is called from your code, you do not in any way
control what library routine to call, and this could all change without
your code being aware of any of it.

This is because you are actually invoking standard language features in
your source code. Exactly how those are implemented is not the same
thing as some reference to some external library explicit in the code.
Post by David Froble
And a John Reagan question. Why is there a RET at the end of a main?
It's not a subroutine.
I cannot fully explain how the compiler looks at all of this, but it
certainly looks like it sets up R0 with a success exit status code
before the RET.
Also, Z2$MAIN starts with an entry mask, which makes me suspect that
there is generic start code in the BASIC RTS, which then calls your
program through a generic call, and at the end you normally return to
the RTS code for cleanup before actual program exit.
Post by David Froble
Post by Johnny Billquist
Post by David Froble
C just needs to be better at which library routines it uses ....
The problem (or one problem) with C is that the language doesn't really
have strings. You have pointers. And arrays... But actually, arrays
are pretty much just pointers as well. And strings are just arrays of
integers. And so, you have some convention on how to treat some arrays
as strings, in some special ways and cases. But the language still does
not have strings.
I'll agree, the C compiler doesn't know about strings. But what is a
string, or any other variable or literal? It's an address, and perhaps
some more data, such as length. Which is how strings can be used in C,
by setting up a structure which includes an address, a length, and
perhaps some other data. Then all that's required is a library routine
to work with the string.
The difference in Basic is that the compiler will set up the descriptor.
There are way more differences between them...
In BASIC, you can return a string from a function. It is a type. In C
you cannot, as that would require that you could return an array of
unknown size. What you can return in C is a pointer, which can be
pointing to an array. But that is then an object that you might end up
with many references to, and whose scope you need to be careful about,
and you might need to keep track of ownership and eventual memory freeing.
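
As an illustration of the ownership point, a minimal sketch of what "returning a string" usually looks like in C. The function name concat_new is made up for this example. What comes back is a pointer to heap storage, and it is the caller's job to free it exactly once.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding a followed by b.
   The caller owns the result and must free() it. Returns NULL if allocation fails. */
char *concat_new(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *out = malloc(la + lb + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);   /* the + 1 also copies b's terminating NUL */
    return out;
}
```

Nothing in the type char * records who owns the storage or how long it stays valid; that lives only in the comment and in the programmer's head.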

Also, since strings are an actual type in BASIC, you can copy between
them, modify them, and so on, without worrying about unintentional side
effects. In C, as strings don't really exist, and you actually just have
pointers, you need to pay much more attention to what you are doing. If
you want to modify a string, you need to make a copy of it, and then
modify that one. You can also not easily take substrings from a string,
without risk of corrupting the original string.
Finally, since STRINGs in BASIC have size information, you cannot go
outside and unintentionally clobber random memory. As strings are just
array pointers in C, and pointer arithmetic can lead you anywhere, you
can address all memory when you think you are playing with your string.

Something like:

x = "abc"[7];

in C is a nice illustration of the "problem". If you could even express
this in BASIC that way, it would be an error because of being out of
range. In C, it compiles without complaint (strictly, the access is
undefined behavior) and will give you something. Who knows what.
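
For comparison, a small sketch of the range check that C leaves to the programmer. The helper checked_at is hypothetical, not a standard library routine. Note that "abc" occupies four bytes including the terminating NUL, so index 7 is outside the array and reading it is undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fetch s[i] only if i lies inside the string.
   Returns 0 and stores the character on success, -1 if i is out of range. */
int checked_at(const char *s, size_t i, char *out)
{
    assert(s != NULL && out != NULL);
    if (i >= strlen(s))
        return -1;      /* refuse instead of reading past the end */
    *out = s[i];
    return 0;
}
```

With this, checked_at("abc", 7, &c) reports an error rather than reading whatever happens to sit after the literal, but only because someone remembered to call it.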

All of these, then, are sources of potential bugs in C code that people
lament so often. :-)
Post by David Froble
Post by Johnny Billquist
So, obviously, dealing with strings is always going to be a ride.
(But I do like C, I just don't consider it to be the solution to all
the problems in the world.)
Most of the discussions here, however, seem to focus too much on
languages, as if that is the problem and the solution. In the end, in
my experience, it's all about good programmers. Bad ones will create
problems no matter what language they write in. And unfortunately, bad
programmers outnumber good ones by a big margin, and it's only getting
worse as both academia and industry now think that the tools are the
solution to all problems.
Now there you got my 100% agreement.
Thanks.
I'd like to add that while having languages that help get rid of
some problems is useful, the fact is that no language or compiler can
figure out when a programmer writes semantic bugs. All they can catch is
syntactic bugs. So no tool in the world is ever going to be able to
catch the more serious and nefarious bugs. Bad programmers will continue
to be a headache, but academia and industry are now doing themselves a
disservice by making other people believe that there is a solution to
the problem of buggy programs, and they do not have to try and hire that
clever person who actually knows how to write code, but costs so much
money...

Oh well. Some companies do understand, which is enough for me...

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
David Froble
2017-01-16 03:42:55 UTC
Post by Johnny Billquist
There are way more differences between them...
In BASIC, you can return a string from a function.
Depends on how you look at it. I think that it's really just an address, that
the calling code knows is the address of a string descriptor. Haven't looked
too hard.
Post by Johnny Billquist
It is a type. In C
you cannot, as that would require that you could return an array of
unknown size. What you can return in C is a pointer, which can be
pointing to an array. But that is then an object that you might end up
with many references to, and whose scope you need to be careful about,
and you might need to keep track of ownership and eventual memory freeing.
I'm guessing that if the C calling code passes the address of a string
descriptor, and the called code knows that, then yes you could return a string.
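
A sketch of that guess in C, under the assumption of a caller-owned fixed buffer. The struct str_desc and the function get_greeting are invented for illustration and are not the VMS descriptor definitions. The callee "returns" the string by filling in storage the caller passed by address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-buffer descriptor: the caller owns the storage. */
typedef struct {
    size_t capacity;  /* bytes available at data */
    size_t len;       /* bytes actually in use */
    char  *data;      /* caller-supplied buffer */
} str_desc;

/* "Returns" a string by filling the caller's descriptor.
   Returns 0 on success, -1 if the text does not fit (descriptor left untouched). */
int get_greeting(str_desc *result)
{
    static const char text[] = "hello";
    size_t n = sizeof text - 1;        /* length without the terminating NUL */

    assert(result != NULL && result->data != NULL);
    if (n > result->capacity)
        return -1;
    memcpy(result->data, text, n);
    result->len = n;
    return 0;
}
```

It works, but only by agreement between caller and callee. The compiler sees a pointer to a struct and nothing more.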
Post by Johnny Billquist
Also, since strings are an actual type in BASIC, you can copy between
them, modify them, and so on, without worrying about unintentional side
effects. In C, as strings don't really exist, and you actually just have
pointers, you need to pay much more attention to what you are doing. If
you want to modify a string, you need to make a copy of it, and then
modify that one.
Maybe not, and no, I don't know, just speculating. See above.
Post by Johnny Billquist
You can also not easily take substrings from a string,
without risk of corrupting the original string.
Interesting claim. Why not?
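
One common reading of the claim, sketched in C (the function names are made up): a C string ends at the first NUL, so the cheap way to carve out a substring is to write a NUL into the original, which shortens it. The safe way costs an allocation and a copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* In-place "substring": terminates at start + len by overwriting a byte of the original. */
char *substr_in_place(char *s, size_t start, size_t len)
{
    assert(s != NULL && start + len <= strlen(s));
    s[start + len] = '\0';     /* the original string is now cut short here */
    return s + start;
}

/* Copying substring: leaves the original alone. Caller must free() the result. */
char *substr_copy(const char *s, size_t start, size_t len)
{
    assert(s != NULL && start + len <= strlen(s));
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, s + start, len);
    out[len] = '\0';
    return out;
}
```

Applied to a buffer holding "hello world" with start 0 and len 5, the copying version leaves the buffer intact, while the in-place version leaves it reading just "hello".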
Post by Johnny Billquist
Finally, since STRINGs in BASIC have size information, you cannot go
outside and unintentionally clobber random memory. As strings are just
array pointers in C, and pointer arithmetic can lead you anywhere, you
can address all memory when you think you are playing with your string.
x = "abc"[7];
in C is a nice illustration of the "problem". If you could even express
this in BASIC that way, it would be an error because of being out of
range. In C, it compiles without complaint (strictly, the access is
undefined behavior) and will give you something. Who knows what.
I sure don't have a clue.
Post by Johnny Billquist
All of these, then, are sources of potential bugs in C code that people
lament so often. :-)
I've always thought that the computer is there to do the grunt work. Thus, C
not having the constructs to do so, is I think not so good.
Jan-Erik Soderholm
2017-01-16 09:06:56 UTC
Post by David Froble
Post by Johnny Billquist
There are way more differences between them...
In BASIC, you can return a string from a function.
Depends on how you look at it.
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!

C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".

Basic looks at it as a "string". And your application logic
better also do that, or the compiler will throw some error.

Or in other words, the Basic compiler *knows* it is a string,
the C compiler does not.

What your application logic in your own code does, is in this
Post by David Froble
I think that it's really just an address,
that the calling code knows is the address of a string descriptor. Haven't
looked too hard.
Post by Johnny Billquist
It is a type. In C you cannot, as that would require that you could
return an array of unknown size. What you can return in C is a pointer,
which can be pointing to an array. But that is then an object that you
might end up with many references to, and whose scope you need to be
careful about, and you might need to keep track of ownership and eventual
memory freeing.
I'm guessing that if the C calling code passes the address of a string
descriptor, and the called code knows that, then yes you could return a string.
Still, *C* (as such) knows nothing about strings! It is just a "pointer
to a byte array". You can (in your code) do anything with it.

Basic knows it is a string, and you can only do string
operations on the returned value.

There is no error from the C compiler saying "Hey, you can not
do that since this is a string!".
Jan-Erik Soderholm
2017-01-16 09:13:23 UTC
Was sent partly finished...
Complete post below...

Jan-Erik.
Post by Jan-Erik Soderholm
Post by David Froble
Post by Johnny Billquist
There are way more differences between them...
In BASIC, you can return a string from a function.
Depends on how you look at it.
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
Basic looks at it as a "string". And your application logic
better also do that, or the compiler will throw some error.
Or in other words, the Basic compiler *knows* it is a string,
the C compiler does not.
What your application logic in your own code does, is in this
What your application logic in your code does, is in this
regard uninteresting. The discussion was around differences
in programming languages and language "built-ins".
Post by Jan-Erik Soderholm
Post by David Froble
I think that it's really just an address,
that the calling code knows is the address of a string descriptor. Haven't
looked too hard.
Post by Johnny Billquist
It is a type. In C you cannot, as that would require that you could
return an array of unknown size. What you can return in C is a pointer,
which can be pointing to an array. But that is then an object that you
might end up with many references to, and whose scope you need to be
careful about, and you might need to keep track of ownership and eventual
memory freeing.
I'm guessing that if the C calling code passes the address of a string
descriptor, and the called code knows that, then yes you could return a string.
Still, *C* (as such) knows nothing about strings! It is just a "pointer
to a byte array". You can (in your code) do anything with it.
Basic knows it is a string, and you can only do string
operations on the returned value.
There is no error from the C compiler saying "Hey, you can not
do that since this is a string!".
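
A small sketch of what that means in practice (function names invented for the example). Both routines do distinctly non-string things to a "string", and a conforming C compiler accepts them without a diagnostic, because to the compiler these are just characters, i.e. small integers, and pointers to them.

```c
#include <assert.h>
#include <stddef.h>

/* Treats the "string" as a run of small integers and adds them up. */
int sum_bytes(const char *s)
{
    assert(s != NULL);
    int total = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        total += s[i];      /* ordinary integer arithmetic on characters */
    return total;
}

/* Pointer subtraction on a "string": valid as long as both point into the same array. */
ptrdiff_t distance(const char *from, const char *to)
{
    return to - from;
}
```

Whether adding up the characters of a name makes any sense is entirely up to the application logic; the language has no opinion.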
Richard Maher
2017-01-16 11:22:53 UTC
Post by Jan-Erik Soderholm
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
IIRC it is everything to do with how the Procedure Calling Standard
looks at it. To return a string from a function one must return the
descriptor in ARG1, again IIRC
Jan-Erik Soderholm
2017-01-16 13:32:04 UTC
Post by Jan-Erik Soderholm
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
IIRC it is everything to do with how the Procedure Calling Standard looks
at it. To return a string from a function one must return the descriptor in
ARG1, again IIRC
But that has little to do with how a "string" is represented in the
programming languages "C" and "Basic" as such, no?

And how C and Basic handle strings has nothing to do with
VMS either... :-) It is in the language standards.

That C on VMS specifically has library support to be able to use
(string) descriptors from C is also something else.
David Froble
2017-01-16 16:56:56 UTC
Post by Jan-Erik Soderholm
Post by Jan-Erik Soderholm
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
IIRC it is everything to do with how the Procedure Calling Standard looks
at it. To return a string from a function one must return the
descriptor in
ARG1, again IIRC
But that has little to do with how a "string" is represented in the
programming languages "C" and "Basic" as such, no?
And how C and Basic handle strings has nothing to do with
VMS either... :-) It is in the language standards.
One of the issues is that perhaps we're not working with things that adhere
strictly to some standard.

One example, from Basic. The concept of an address was not something that Basic
(Basic+, BP2, and VAX Basic) supported. Then one day the LOC() function was
introduced. With this, a programmer could get at things Basic never intended.

Is this good or bad? Good question. As far as I know, and it may be an
interesting topic, Basic had no way to use AST routines. I now use LOC() to get
the address of an AST routine, so that I can specify it when queuing an AST,
from Basic.

Just one example.
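
For contrast, and only as an analogy: in C, taking the address of a routine is built into the language, so handing a completion routine to a service needs nothing like LOC(). The names below (completion_fn, queue_request, on_complete) are invented, and queue_request is a stand-in that calls back immediately rather than a real asynchronous system service.

```c
#include <assert.h>
#include <stddef.h>

/* Type of a completion routine that receives one user-supplied parameter. */
typedef void (*completion_fn)(int param);

int last_param = 0;   /* records what the completion routine was called with */

/* The routine whose address gets handed over. */
void on_complete(int param)
{
    last_param = param;
}

/* Hypothetical stand-in for a service that accepts a routine address. */
void queue_request(completion_fn routine, int param)
{
    assert(routine != NULL);
    routine(param);   /* a real service would call this later, on completion */
}
```

A call such as queue_request(on_complete, 42) passes the routine's address simply by naming it.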
Jan-Erik Soderholm
2017-01-16 23:52:27 UTC
Post by David Froble
Post by Jan-Erik Soderholm
Post by Jan-Erik Soderholm
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
IIRC it is everything to do with how the Procedure Calling Standard looks
at it. To return a string from a function one must return the descriptor in
ARG1, again IIRC
But that has little to do with how a "string" is represented in the
programming languages "C" and "Basic" as such, no?
And how C and Basic handle strings has nothing to do with
VMS either... :-) It is in the language standards.
One of the issues is that perhaps we're not working with things that adhere
strictly to some standard.
There has never been a standard for a strict string data type in C.
Basic has more or less always had a string data type as far as I remember.
Post by David Froble
One example, from Basic. The concept of an address was not something that
Basic (Basic+, BP2, and VAX Basic) supported. Then one day the LOC()
function was introduced. With this, a programmer could get at things Basic
never intended.
Is this good or bad? Good question. As far as I know, and it may be an
interesting topic, Basic had no way to use AST routines. I now use LOC()
to get the address of an AST routine, so that I can specify it when queuing
an AST, from Basic.
Just one example.
That is just an example of an addition in a specific Basic
implementation. It has nothing to do with any "standard".
Richard Maher
2017-01-17 01:34:46 UTC
Post by Jan-Erik Soderholm
Post by Jan-Erik Soderholm
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
IIRC it is everything to do with how the Procedure Calling Standard looks
at it. To return a string from a function one must return the
descriptor in
ARG1, again IIRC
But that has little to do with how a "string" is represented in the
programming languages "C" and "Basic" as such, no?
Who gives a toss about C? It doesn't even support strings and instead
has an array of bytes.

Before Microsoft "invented" the CLR VMS had the procedure calling
standard and descriptors. As with the discussion on the DLM, descriptors
are a convention and it is up to individual languages whether or not to
follow the rules. Personally I think convention works and drive on the
same side of the road as everyone else. You do what you like.
Post by Jan-Erik Soderholm
And how C and Basic handle strings has nothing to do with
VMS either... :-) It is in the language standards.
Look at a living language like JavaScript, which now supports full-blown
classes and promises for virtual in-lining.
Post by Jan-Erik Soderholm
That C on VMS specifically has library support to be able to use
(string) descriptors from C, is also something else.
Look VMS is dead unless VSI can quickly move from a port to
virtualization/cloud/web-server.
David Froble
2017-01-17 02:13:49 UTC
Post by Richard Maher
Post by Jan-Erik Soderholm
Post by Jan-Erik Soderholm
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
IIRC it is everything to do with how the Procedure Calling Standard looks
at it. To return a string from a function one must return the descriptor in
ARG1, again IIRC
But that has little to do with how a "string" is represented in the
programming languages "C" and "Basic" as such, no?
Who gives a toss about C? It doesn't even support strings and instead
has an array of bytes.
Before Microsoft "invented" the CLR VMS had the procedure calling
standard and descriptors. As with the discussion on the DLM, descriptors
are a convention and it is up to individual languages whether or not to
follow the rules. Personally I think convention works and drive on the
same side of the road as everyone else. You do what you like.
Post by Jan-Erik Soderholm
And how C and Basic handle strings has nothing to do with
VMS either... :-) It is in the language standards.
Look at a living language like JavaScript, which now supports full-blown
classes and promises for virtual in-lining.
Post by Jan-Erik Soderholm
That C on VMS specifically has library support to be able to use
(string) descriptors from C, is also something else.
Look VMS is dead unless VSI can quickly move from a port to
virtualization/cloud/web-server.
Huh! I thought the requirement was a port of Rdb?
Richard Maher
2017-01-17 04:08:04 UTC
Post by David Froble
Huh! I thought the requirement was a port of Rdb?
Rdb is being ported to VMS x86. Whether or not it is ported to Windows
is up to Larry so it WON'T EVER happen!

I was explaining what needs to be done if VMS is to have a chance at
growing.
David Froble
2017-01-17 06:34:20 UTC
Permalink
Raw Message
Post by Richard Maher
Post by David Froble
Huh! I thought the requirement was a port of Rdb?
Rdb is being ported to VMS x86.
"is" sort of implies present tense.

So, you know something the rest of us don't know? I haven't seen any official
statements that say that.

I'd be surprised if it wasn't going to happen.

There is no x86 VMS to port to at this time.
Post by Richard Maher
Whether or not it is ported to Windows
is up to Larry so it WON'T EVER happen!
I was explaining what needs to be done if VMS is to have a chance at
growing.
From your perspective, of course ....
Richard Maher
2017-01-17 19:00:02 UTC
Post by David Froble
Post by Richard Maher
Post by David Froble
Huh! I thought the requirement was a port of Rdb?
Rdb is being ported to VMS x86.
"is" sort of implies present tense.
So, you know something the rest of us don't know? I haven't seen any
official statements that say that.
I'd be surprised if it wasn't going to happen.
There is no x86 VMS to port to at this time.
See Kevin Duffy and grow a brain.
Post by David Froble
Post by Richard Maher
Whether or not it is ported to Windows is up to Larry so it WON'T EVER
happen!
I was explaining what needs to be done if VMS is to have a chance at
growing.
From your perspective, of course ....
No the customer/business perspective is they now are able to rid
themselves of the pretentious cock >$100/day System Managers that have
plagued their lives for 30 years.

"Please Sir! Can I have some more privilege to do my job?"

Oh but they know how to size page and swap files and can name half a dozen
sysgen parameters so they must be worth it?

DSC plus non-extortionate licensing models on commodity hardware is what
some find desirable for some reason.

1) Tight-arse entry level developer
2) McDonalds "regular" magnetic disk entry level
3) Sweet Spot medium business configuration
4) The Big-Fucker 2000

Spin up as many VMs as are required, continually upgraded, and DR
(Sydney, Melb, Tokyo) all for free. Why would anyone want that?

For over 20 years us Devs have been outsourced, off-shored, and reduced
to supplier-relationship management interface jockeys :-( Now you
prima-donnas get the chop and not before time!
David Froble
2017-01-17 19:27:34 UTC
Post by Richard Maher
Post by David Froble
Post by Richard Maher
Post by David Froble
Huh! I thought the requirement was a port of Rdb?
Rdb is being ported to VMS x86.
"is" sort of implies present tense.
So, you know something the rest of us don't know? I haven't seen any
official statements that say that.
I'd be surprised if it wasn't going to happen.
There is no x86 VMS to port to at this time.
See Kevin Duffy and grow a brain.
Richard, could you explain that.
Post by Richard Maher
Post by David Froble
Post by Richard Maher
Whether or not it is ported to Windows is up to Larry so it WON'T EVER
happen!
I was explaining what needs to be done if VMS is to have a chance at
growing.
From your perspective, of course ....
No the customer/business perspective is they now are able to rid
themselves of the pretentious cock >$100/day System Managers that have
plagued their lives for 30 years.
Sorry to rain on your rant, but not one of our customers has the traditional
"System Manager". A few people who support the users, yes, and you'll never get
away from that.
Post by Richard Maher
"Please Sir! Can I have some more privilege to do my job?"
No privs allowed!
Post by Richard Maher
Oh but they know how to size page and swap files and can name half a dozen
sysgen parameters so they must be worth it?
None of that either. Consolidated Data people might from time to time check up
on such things, but, actually, I cannot remember the last time that happened.
Nice thing about VMS, it just works.
Post by Richard Maher
DSC plus non-extortionate licensing models on commodity hardware is what
some find desirable for some reason.
Well, catch up, some VSI people have mentioned no license fees and mandatory
support. Frankly, I'm not aware of any serious companies who would forgo
having support. We tell them it's required. Unless they want to be down for a
week or so ....
Post by Richard Maher
1) Tight-arse entry level developer
2) McDonalds "regular" magnetic disk entry level
3) Sweet Spot medium business configuration
4) The Big-Fucker 2000
Spin up as many VMs as are required, continually upgraded, and DR
(Sydney, Melb, Tokyo) all for free. Why would anyone want that?
For over 20 years us Devs have been outsourced, off-shored, and reduced
to supplier-relationship management interface jockeys :-( Now you
prima-donnas get the chop and not before time!
Richard, are you having a bad day? Maybe watch your blood pressure, you could
do yourself an injury.
Johnny Billquist
2017-01-17 19:36:38 UTC
Permalink
Raw Message
Post by Richard Maher
Post by Jan-Erik Soderholm
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
IIRC it is everything to do with how the Procedure Calling Standard
looks at it. To return a string from a function one must return the
descriptor in ARG1, again IIRC
In C, you can treat it as anything you want, no matter what the
procedure calling standard thinks it is. That's C in a nutshell.

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
David Froble
2017-01-16 16:45:22 UTC
Post by Jan-Erik Soderholm
Post by David Froble
Post by Johnny Billquist
There are way more differences between them...
In BASIC, you can return a string from a function.
Depends on how you look at it.
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
Basic looks at it as a "string". And your application logic
better also do that, or the compiler will throw some error.
Or in other words, the Basic compiler *knows* it is a string,
the C compiler does not.
What your application logic in your own code does, is in this
Post by David Froble
I think that it's really just an address,
that the calling code knows is the address of a string descriptor.
Haven't
looked too hard.
Post by Johnny Billquist
It is a type. In C you cannot, as that would require that you could
return an array of unknown size. What you can return in C is a pointer,
which can be pointing to an array. But that is then an object that you
might end up with many references to, and whose scope you need to be
careful about, and you might need to keep track of ownership and eventual
memory freeing.
I'm guessing that if the C calling code passes the address of a string
descriptor, and the called code knows that, then yes you could return a string.
Still, *C* (as such) knows nothing about strings! It is just a "pointer
to a byte array". You can (in your code) do anything with it.
Basic knows it is a string, and you can only do string
operations on the returned value.
There is no error from the C compiler saying "Hey, you can not
do that since this is a string!".
I do not dispute what you write. It is within limits correct.

But some may choose to use multiple languages, and in such cases one must be
aware of how both handle things. Perhaps one nice thing about the VMS calling
standard is that it can corrupt people, and some of them may use some code that
if used in a single language would not be valid.

All I was doing was pointing out that what Johnny (I believe) said was
unsupported in a particular language could in fact be done.
Jan-Erik Soderholm
2017-01-16 23:59:06 UTC
Post by David Froble
Post by Jan-Erik Soderholm
Post by David Froble
Post by Johnny Billquist
There are way more differences between them...
In BASIC, you can return a string from a function.
Depends on how you look at it.
It has nothing to do with how *you* look at it.
It has everything to do with how the *compilers* look at it!
C just looks at it as an address/pointer, nothing else. It is
your C application logic that treats it as a "string".
Basic looks at it as a "string". And your application logic
better also do that, or the compiler will throw some error.
Or in other words, the Basic compiler *knows* it is a string,
the C compiler does not.
What your application logic in your own code does, is in this
Post by David Froble
I think that it's really just an address,
that the calling code knows is the address of a string descriptor. Haven't
looked too hard.
Post by Johnny Billquist
It is a type. In C you cannot, as that would require that you could
return an array of unknown size. What you can return in C is a pointer,
which can be pointing to an array. But that is then an object that you
might end up with many references to, and whose scope you need to be
careful about, and you might need to keep track of ownership and eventual
memory freeing.
I'm guessing that if the C calling code passes the address of a string
descriptor, and the called code knows that, then yes you could return a string.
Still, *C* (as such) knows nothing about strings! It is just a "pointer
to a byte array". You can (in your code) do anything with it.
Basic knows it is a string, and you can only do string
operations on the returned value.
There is no error from the C compiler saying "Hey, you can not
do that since this is a string!".
I do not dispute what you write. It is within limits correct.
But some may choose to use multiple languages, and in such cases one must
be aware of how both handle things...
And when it comes to strings, some languages (such as Basic) handle them
for you, while in others (such as C) you have to handle them yourself.
Post by David Froble
All I was doing was pointing out that what Johnny (I believe) said was
unsupported in a particular language...
Unsupported in the actual language standard, yes.
Post by David Froble
...could in fact be done.
And who has ever said anything else? By adding library routines you
can do virtually anything in any language. But that is outside the
scope of the specific language standards.
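
To make the pointer-versus-string distinction concrete, here is a minimal C sketch of string concatenation done by hand. The name concat_strings is made up for illustration and is not part of any standard library. What comes back is only a char pointer; ownership of the memory is a convention between caller and callee that the compiler does not check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a standard routine): returns a newly
 * allocated buffer holding a followed by b, or NULL if allocation
 * fails. The caller owns the result and must free() it. */
char *concat_strings(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a);
    size_t lb = strlen(b);

    char *out = malloc(la + lb + 1);   /* +1 for the terminating null */
    if (out == NULL)
        return NULL;

    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);       /* also copies b's terminating null */
    return out;
}
```

A caller would write something like char *z3 = concat_strings("abc", "123"); and later free(z3). Nothing stops the caller from forgetting the free, freeing twice, or doing pointer arithmetic on z3, which is the "you have to handle them yourself" part.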
Bob Koehler
2017-01-17 14:21:49 UTC
Post by David Froble
I'm guessing that if the C calling code passes the address of a string
descriptor, and the called code knows that, then yes you could return a string.
The C compiler never generates a descriptor unless the programmer has
specifically done so.
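
As a rough illustration of "the programmer has specifically done so", the sketch below builds a descriptor by hand. The struct str_desc and the function make_desc are simplified stand-ins invented for this example; on OpenVMS the real definitions come from <descrip.h> (struct dsc$descriptor_s and the $DESCRIPTOR macro), and the type and class values here are only illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for a fixed-length string
 * descriptor: a length, two tag bytes, and a pointer to the text.
 * This is NOT the real OpenVMS definition. */
struct str_desc {
    unsigned short length;   /* number of bytes, no terminator counted */
    unsigned char  dtype;    /* data type tag */
    unsigned char  dclass;   /* descriptor class tag */
    const char    *pointer;  /* address of the first byte */
};

/* Illustrative tag values only. */
enum { DESC_TYPE_TEXT = 14, DESC_CLASS_STATIC = 1 };

/* Build a descriptor that refers to an existing C string.
 * No copy is made, so the string must outlive the descriptor. */
struct str_desc make_desc(const char *s)
{
    assert(s != NULL);

    size_t len = strlen(s);
    assert(len <= 65535);    /* must fit the 16-bit length field */

    struct str_desc d;
    d.length  = (unsigned short)len;
    d.dtype   = DESC_TYPE_TEXT;
    d.dclass  = DESC_CLASS_STATIC;
    d.pointer = s;
    return d;
}
```

The called routine then receives the address of the descriptor, for example &d after struct str_desc d = make_desc("abc"). All of that is explicit source code; the C compiler adds none of it on its own.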
John Reagan
2017-01-16 00:28:30 UTC
Post by David Froble
Post by Johnny Billquist
Post by David Froble
Post by Stephen Hoffman
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
Ayup. C string handling sometimes makes me long for BASIC.
(Somebody please check on David. I think he may have fainted there.)
Nope. Take more than that.
But, it's not the language, it's the compiler using decent library
routines. I could argue that Basic doesn't do anything with strings.
At least from the perspective that it's mostly or all library calls.
Partly, and partly not. BASIC does not really have this as libraries, even
though the implementation might sit in some library. It's a part of the
language itself, and you cannot really separate the things in BASIC.
Z2$MAIN
1 1 Z1$ = "abc"
2 Z2$ = "123"
3 Z3$ = Z1$ + Z2$
4 End
Z2$MAIN
Generated code
0000: .PSECT $CODE
CFFC 0000: .WORD ^M<R2,R3,R4,R5,R6,R7,R8,R9,R10,R11,I>
52 FB AF 9E 0002: MOVAB .-3, R2
50 00000004 0G 9E 0006: MOVAB $PDATA+4, R0
51 50 D0 000D: MOVL R0, R1
00000000 GG 16 0010: JSB BAS$INIT_R8
FC AD FD AF 9E 0016: $L_1: MOVAB $L_1, -4(FP)
51 03 32 001B: CVTWL #3, R1
52 00000063 0G 9E 001E: MOVAB $PDATA+99, R2
50 5B AB 7E 0025: MOVAQ Z1$(R11), R0
00000000 GG 16 0029: JSB STR$COPY_R_R8
51 03 32 002F: CVTWL #3, R1
52 00000060 0G 9E 0032: MOVAB $PDATA+96, R2
50 63 AB 7E 0039: MOVAQ Z2$(R11), R0
00000000 GG 16 003D: JSB STR$COPY_R_R8
63 AB 7F 0043: PUSHAQ Z2$(R11)
5B AB 7F 0046: PUSHAQ Z1$(R11)
6B AB 7F 0049: PUSHAQ Z3$(R11)
00000000 GG 03 FB 004C: CALLS #3, STR$CONCAT
50 00000004 0G 9E 0053: MOVAB $PDATA+4, R0
00000000 GG 16 005A: JSB BAS$END_R8
50 01 D0 0060: MOVL #1, R0
04 0063: RET
0064: .END
Some snipping to attempt to make it fit and readable.
Note that the compiler does not produce any code to do any of the operations.
All it's doing is pushing arguments and invoking a library routine. This is an
example of what I've tried to say when I claim that Basic doesn't really do much
of the work.
And a John Reagan question. Why is there a RET at the end of a main? It's not
a subroutine.
It isn't a "main" routine. If you want that to be the entry point, you have to put the name in the ".end" directive

.end z2$main

And even main routines return. How do you think the final status from your program gets into the $STATUS DCL symbol?
David Froble
2017-01-16 03:45:31 UTC
Post by John Reagan
Post by David Froble
Post by Johnny Billquist
Post by David Froble
Post by Stephen Hoffman
Post by John Reagan
From the C99 standard (sorry for the poor cut n paste on my phone).
A string is a contiguous sequence of characters terminated by and
including the first null character. The term multibyte string is
sometimes used instead to emphasize special processing given to
multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial
(lowest addressed) character. The length of a string is the number of
bytes preceding the null character and the value of a string is the
sequence of the values of the contained characters, in order.
Ayup. C string handling sometimes makes me long for BASIC.
(Somebody please check on David. I think he may have fainted there.)
Nope. Take more than that.
But, it's not the language, it's the compiler using decent library
routines. I could argue that Basic doesn't do anything with strings.
At least from the perspective that it's mostly or all library calls.
Partly, and partly not. BASIC does not really have this as libraries, even
though the implementation might sit in some library. It's a part of the
language itself, and you cannot really separate the things in BASIC.
Z2$MAIN
1 1 Z1$ = "abc"
2 Z2$ = "123"
3 Z3$ = Z1$ + Z2$
4 End
Z2$MAIN
Generated code
0000: .PSECT $CODE
CFFC 0000: .WORD ^M<R2,R3,R4,R5,R6,R7,R8,R9,R10,R11,I>
52 FB AF 9E 0002: MOVAB .-3, R2
50 00000004 0G 9E 0006: MOVAB $PDATA+4, R0
51 50 D0 000D: MOVL R0, R1
00000000 GG 16 0010: JSB BAS$INIT_R8
FC AD FD AF 9E 0016: $L_1: MOVAB $L_1, -4(FP)
51 03 32 001B: CVTWL #3, R1
52 00000063 0G 9E 001E: MOVAB $PDATA+99, R2
50 5B AB 7E 0025: MOVAQ Z1$(R11), R0
00000000 GG 16 0029: JSB STR$COPY_R_R8
51 03 32 002F: CVTWL #3, R1
52 00000060 0G 9E 0032: MOVAB $PDATA+96, R2
50 63 AB 7E 0039: MOVAQ Z2$(R11), R0
00000000 GG 16 003D: JSB STR$COPY_R_R8
63 AB 7F 0043: PUSHAQ Z2$(R11)
5B AB 7F 0046: PUSHAQ Z1$(R11)
6B AB 7F 0049: PUSHAQ Z3$(R11)
00000000 GG 03 FB 004C: CALLS #3, STR$CONCAT
50 00000004 0G 9E 0053: MOVAB $PDATA+4, R0
00000000 GG 16 005A: JSB BAS$END_R8
50 01 D0 0060: MOVL #1, R0
04 0063: RET
0064: .END
Some snipping to attempt to make it fit and readable.
Note that the compiler does not produce any code to do any of the operations.
All it's doing is pushing arguments and invoking a library routine. This is an
example of what I've tried to say when I claim that Basic doesn't really do much
of the work.
And a John Reagan question. Why is there a RET at the end of a main? It's not
a subroutine.
It isn't a "main" routine. If you want that to be the entry point, you have to put the name in the ".end" directive
.end z2$main
And even main routines return. How do you think the final status from your program gets into the $STATUS DCL symbol?
Thanks. I thought it might have something to do with the exit status of the
program.
Arne Vajhøj
2017-01-15 22:30:03 UTC
Post by Johnny Billquist
Most the discussions here, however, seem to focus too much on languages,
as if that is the problem and the solution. In the end, in my
experience, it's all about good programmers. Bad ones will create
problems no matter what language they write in. And unfortunately, bad
programmers outnumber good ones by a big margin, and it's only getting
worse as both academia and industry now thinks that the tools are the
solution to all problems.
Good programmers produce good code in any language.

Bad programmers produce bad code in any language.

But neither good nor bad programmers are the norm.

The big group is the mediocre programmer.

A good language (for the type of programming that requires a large number
of programmers) is one that enables mediocre programmers to produce
good code.

And a bad language (with the same disclaimer) is one that results in
mediocre programmers producing bad code.

Arne
Stephen Hoffman
2017-01-13 22:34:54 UTC
In some projects, we've #ifdef'd calls we know are risky and that have
accordingly been (locally) deprecated, but null handling in C can be...
interesting to get right.
Of course, there's no reason why the string handling library you use in
C has to have null terminated strings at the core of its
implementation.
Null-terminated strings with strn calls and then drop in a null
terminator, or strl calls where available. Scanning tools (and
compilation warnings, where available) can help here, too. So can
higher-level frameworks, which can avoid this.

One of the more famous discussions of str and strl calls:
https://www.sudo.ws/todd/papers/strlcpy.pdf There are others.
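
As a rough sketch of the "bounded copy, then make sure the terminator is there" idea, the helper below copies at most dstsize - 1 bytes and always terminates the destination. The name bounded_copy is hypothetical; it follows the return convention described in the strlcpy paper (the length of the source, so truncation can be detected), but it is a portable stand-in and not the BSD routine itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with strlcpy-like semantics:
 * copies src into dst, never writes more than dstsize bytes, and
 * always null-terminates when dstsize > 0. Returns strlen(src), so
 * a return value >= dstsize means the copy was truncated. */
size_t bounded_copy(char *dst, size_t dstsize, const char *src)
{
    assert(src != NULL);
    assert(dst != NULL || dstsize == 0);

    size_t srclen = strlen(src);

    if (dstsize > 0) {
        /* Leave room for the terminator. */
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

A caller checks for truncation with if (bounded_copy(buf, sizeof buf, input) >= sizeof buf). With plain strncpy the terminator is not written when the source fills the buffer, so the caller has to add it by hand.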

Dealing with null pointers and type punning and UB is where things get
more interesting with C programs, though. More so than the
null-terminated strings.

For anyone that might suggest descriptors as an alternative to C
strings, those were a great idea when they were invented, but I'd much
rather be dealing with a garbage-collected framework and string
objects. Way better, way more flexible, less code, and easy to update
parts of the API without also having to rewrite calling source code.

If limited to using string descriptors, then I'd really rather use
descriptors that have some provision for referencing the character
encoding associated with the string, too. But descriptors in C still
means shuffling pointers around with all the "fun" that such efforts
inevitably encounter. Descriptor support in C on OpenVMS results in a
pile of glue code, particularly given the inexplicable lack of embedded
support for using descriptors anywhere in the OpenVMS C library. It's
all calls to str$ or lib$ or such, and all managed by the developer and
not by C itself underneath.
--
Pure Personal Opinion | HoffmanLabs LLC
Arne Vajhøj
2017-01-14 01:29:30 UTC
Post by Simon Clubley
Post by Stephen Hoffman
In some projects, we've #ifdef'd calls we know
are risky and that have accordingly been (locally) deprecated, but null
handling in C can be... interesting to get right.
Of course, there's no reason why the string handling library you
use in C has to have null terminated strings at the core of its
implementation.
No.

But the one that comes with C has null terminated strings.

And the vast majority of third party libraries expect
null terminated strings.

Hard to avoid.

Arne
Stephen Hoffman
2017-01-17 16:44:24 UTC
Post by Arne Vajhøj
In some projects, we've #ifdef'd calls we know are risky and that have
accordingly been (locally) deprecated, but null handling in C can be...
interesting to get right.
Of course, there's no reason why the string handling library you use in
C has to have null terminated strings at the core of its
implementation.
No.
But the one that comes with C has null terminated strings.
And the vast majority of third party libraries expect
null terminated strings.
Hard to avoid.
Arne
Ayup. The C compiler and library on OpenVMS never got around to
actually integrating support for descriptors, either. Which means calling
the RTL or, as is often the case, piles of glue code serving both as
visual chaff and fodder for introducing bugs.

And again, it was not null-terminated strings I was referring to, it was
null pointers. Pointers to descriptors can be null, too.
Pointers in the descriptors can also be null. Then there's the extra
code needed to deal with 32-bit and 64-bit descriptors. And all the
code necessary to deal with the memory allocations, which is
inconsistently implemented across the various languages and within
OpenVMS, and tracking all of that usage. (Does BASIC even do garbage
collection? AFAIK, it doesn't.) Then there's the lack of tags for
the character encoding, a problem which hits all of the languages as
you start to drag the code forward from the ASCII and DEC MCS era.
Dealing with error handling from the language or system memory
management, including those pesky rogue and null pointers and buffer
overruns and the rest. Which is also about where I start remembering
anew that reference-counted objects are far easier than dealing with
much of this minutiae, and where I start pondering whether porting code
to Rust or such is more appropriate for new work.
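
For a sense of what that glue code tends to look like, here is a minimal sketch that converts a descriptor to a C string while checking both kinds of null pointer mentioned above. The struct str_desc and the function desc_to_cstring are simplified, hypothetical stand-ins for this example; they are not the OpenVMS <descrip.h> definitions, and 64-bit descriptors, dynamic strings, and character encodings are all ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical descriptor layout for illustration only. */
struct str_desc {
    unsigned short length;
    unsigned char  dtype;
    unsigned char  dclass;
    const char    *pointer;
};

/* Copy the text a descriptor refers to into dst as a C string.
 * Returns 1 on success, 0 if the descriptor is unusable or the
 * text does not fit. The descriptor is treated as untrusted input;
 * the destination buffer is the caller's own responsibility. */
int desc_to_cstring(const struct str_desc *d, char *dst, size_t dstsize)
{
    assert(dst != NULL && dstsize > 0);   /* a caller bug, not bad input */

    if (d == NULL)                        /* null pointer to descriptor */
        return 0;
    if (d->pointer == NULL && d->length != 0)
        return 0;                         /* null pointer inside it */
    if ((size_t)d->length >= dstsize)     /* no room for the terminator */
        return 0;

    if (d->length != 0)
        memcpy(dst, d->pointer, d->length);
    dst[d->length] = '\0';
    return 1;
}
```

Every call site then needs its own error path for the 0 return, which is the sort of repetitive checking that string objects in a managed framework take off the programmer's hands.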

Null-terminated strings — ASCIZ strings, as Macro32 calls them — are
scattered around OpenVMS. So are counted ASCII strings; ASCIC. That
all can work, but do any of us want to deal with ASCIZ or ASCIC or
ASCID descriptors directly? Probably not. But that's where you are,
with Macro32 and Bliss and C. The easier those tasks get and the
easier and more reliable memory management gets — fewer leaks, etc —
the better all our code gets. That's where I start looking at
compiler enhancements, and at switching languages.
--
Pure Personal Opinion | HoffmanLabs LLC
John Reagan
2017-01-11 18:16:45 UTC
Post by Robert A. Brooks
Post by u***@gmail.com
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Anyway, my retirement plan is predicated on rewriting the shadowing
driver in C, thereby earning the Reagan Bucks for MACRO-->C conversions.
--
-- Rob
I've seen the shadow driver code. I would strongly encourage investing the "Reagan Bucks" for anti-psychotic drugs or at least a good bottle of bourbon.
u***@gmail.com
2017-01-11 19:05:02 UTC
Post by John Reagan
Post by Robert A. Brooks
Post by u***@gmail.com
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Anyway, my retirement plan is predicated on rewriting the shadowing
driver in C, thereby earning the Reagan Bucks for MACRO-->C conversions.
--
-- Rob
I've seen the shadow driver code. I would strongly encourage investing the "Reagan Bucks" for anti-psychotic drugs or at least a good bottle of bourbon.
sounds like his retirement will be short lived :)
Michael Moroney
2017-01-11 20:21:55 UTC
Post by John Reagan
Post by Robert A. Brooks
Post by u***@gmail.com
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Anyway, my retirement plan is predicated on rewriting the shadowing
driver in C, thereby earning the Reagan Bucks for MACRO-->C conversions.
I've seen the shadow driver code. I would strongly encourage investing the
"Reagan Bucks" for anti-psychotic drugs or at least a good bottle of
bourbon.
I've *worked* on the shadow driver code. I need the anti-psychotic drugs,
and no, bourbon won't be enough.
David Froble
2017-01-11 22:15:54 UTC
Post by John Reagan
Post by Robert A. Brooks
Post by u***@gmail.com
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Anyway, my retirement plan is predicated on rewriting the shadowing
driver in C, thereby earning the Reagan Bucks for MACRO-->C conversions.
--
-- Rob
I've seen the shadow driver code. I would strongly encourage investing the "Reagan Bucks" for anti-psychotic drugs or at least a good bottle of bourbon.
But, it works now, right? If so, then I can think of other more worthwhile
tasks. If it ain't broke, don't fix it.
Simon Clubley
2017-01-12 01:25:00 UTC
Post by David Froble
Post by John Reagan
I've seen the shadow driver code. I would strongly encourage investing the "Reagan Bucks" for anti-psychotic drugs or at least a good bottle of bourbon.
But, it works now, right? If so, then I can think of other more worthwhile
tasks. If it ain't broke, don't fix it.
Working and maintainable are two different things. The terminal
driver kernel code comes to mind here.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
David Froble
2017-01-12 02:00:22 UTC
Post by Simon Clubley
Post by David Froble
Post by John Reagan
I've seen the shadow driver code. I would strongly encourage investing the "Reagan Bucks" for anti-psychotic drugs or at least a good bottle of bourbon.
But, it works now, right? If so, then I can think of other more worthwhile
tasks. If it ain't broke, don't fix it.
Working and maintainable are two different things. The terminal
driver kernel code comes to mind here.
Simon.
A prime example of "just let it alone". Terminals aren't the major method of
interacting with a computer any more. Lots of other stuff now. If it was to be
replaced, many of the features it now has would not make it to the new version.

Yes, yes, I know you want some things in the area of recall buffer, editing
lines, and such, but that doesn't bring in the money.
Stephen Hoffman
2017-01-12 16:38:59 UTC
Post by David Froble
A prime example of "just let it alone". Terminals aren't the major
method of interacting with a computer any more. Lots of other stuff
now. If it was to be replaced, many of the features it now has would
not make it to the new version.
Yes, yes, I know you want some things in the area of recall buffer,
editing lines, and such, but that doesn't bring in the money.
Ayup. An approach which eventually and inevitably catches up with
you, as the tools and user interfaces and the effort involved in
creating same all tend to drift away from what folks expect, and
somebody else shows up with a better approach.

If you're not actively cannibalizing your own, then some other
competitor will invariably decide for you.

Figuring out what's important and what to add or replace or
cannibalize? What to defer or to just deprecate? That trade-off and
that decision is tougher.

Strangling and replacing and deprecating and removing the terminal
driver with something that operates more cleanly with modern
network-connected color-native UTF-8-capable terminal emulators would
be my choice here, though that's further down the proverbial whiteboard
than more than a few other projects.
--
Pure Personal Opinion | HoffmanLabs LLC
Robert A. Brooks
2017-01-12 02:19:29 UTC
Post by David Froble
Post by John Reagan
Post by Robert A. Brooks
Post by u***@gmail.com
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
Let me know when you find a C-language related problem with the
Multipath execlet, which is 100% written in C.
Anyway, my retirement plan is predicated on rewriting the shadowing
driver in C, thereby earning the Reagan Bucks for MACRO-->C conversions.
--
-- Rob
I've seen the shadow driver code. I would strongly encourage investing the
"Reagan Bucks" for anti-psychotic drugs or at least a good bottle of bourbon.
But, it works now, right? If so, then I can think of other more worthwhile
tasks. If it ain't broke, don't fix it.
I didn't think I needed to add a smiley face to signal sarcasm to indicate that
I really was just kidding.
--
-- Rob
Simon Clubley
2017-01-12 00:38:50 UTC
Post by u***@gmail.com
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
What programming language would you suggest instead for new kernel
level code ?

You need to choose something which is suitable for use in a
multi-architecture world and which you know is going to be
still around and supported in a decade or two.

Macro is therefore an absolute no-no and Bliss isn't far behind.

I am on record as preferring Ada, Pillar or a general Wirth style
language but I also recognise that this is extremely unlikely for
VMS for various practical reasons.

You can't use a language of the month such as Rust because it
hasn't been around long enough and it doesn't have a formal
ISO style support structure behind it so you can't be sure it
will still be available in 5-10 years time. (As Stephen likes to
point out, you are not planning for the year 2017, but the
year 2022 or 2027.)

As a secondary note, the SJW nature of the Rust community makes
me nervous as well, because all it takes is for someone to do
something stupid in a year or two for the community to split.
(Look at the current farce around Libreboot for an example.)

So you are left with C, not because it's the best language,
(it most certainly is not) but because there's nothing else
which seems to be suitable.

So Bob, which language would you prefer instead for new kernel
level code ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Arne Vajhøj
2017-01-12 01:58:58 UTC
Post by Simon Clubley
Post by u***@gmail.com
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
What programming language would you suggest instead for new kernel
level code ?
You need to choose something which is suitable for use in a
multi-architecture world and which you know is going to be
still around and supported in a decade or two.
Macro is therefore an absolute no-no and Bliss isn't far behind.
I am on record as preferring Ada, Pillar or a general Wirth style
language but I also recognise that this is extremely unlikely for
VMS for various practical reasons.
You can't use a language of the month such as Rust because it
hasn't been around long enough and it doesn't have a formal
ISO style support structure behind it so you can't be sure it
will still be available in 5-10 years time. (As Stephen likes to
point out, you are not planning for the year 2017, but the
year 2022 or 2027.)
As a secondary note, the SJW nature of the Rust community makes
me nervous as well, because all it takes is for someone to do
something stupid in a year or two for the community to split.
(Look at the current farce around Libreboot for an example.)
So you are left with C, not because it's the best language,
(it most certainly is not) but because there's nothing else
which seems to be suitable.
So Bob, which language would you prefer instead for new kernel
level code ?
C is probably about as high a level of abstraction as you can go for the
real core code.

But for the majority of the code you can go higher. If you want
something both widely used and mature then C++ would be
a strong contender.

Arne
u***@gmail.com
2017-01-14 19:33:13 UTC
Post by Simon Clubley
Post by u***@gmail.com
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
What programming language would you suggest instead for new kernel
level code ?
You need to choose something which is suitable for use in a
multi-architecture world and which you know is going to be
still around and supported in a decade or two.
Macro is therefore an absolute no-no and Bliss isn't far behind.
I am on record as preferring Ada, Pillar or a general Wirth style
language but I also recognise that this is extremely unlikely for
VMS for various practical reasons.
You can't use a language of the month such as Rust because it
hasn't been around long enough and it doesn't have a formal
ISO style support structure behind it so you can't be sure it
will still be available in 5-10 years time. (As Stephen likes to
point out, you are not planning for the year 2017, but the
year 2022 or 2027.)
As a secondary note, the SJW nature of the Rust community makes
me nervous as well, because all it takes is for someone to do
something stupid in a year or two for the community to split.
(Look at the current farce around Libreboot for an example.)
So you are left with C, not because it's the best language,
(it most certainly is not) but because there's nothing else
which seems to be suitable.
So Bob, which language would you prefer instead for new kernel
level code ?
Simon.
--
Microsoft: Bringing you 1980s technology to a 21st century world
ada is fine, even dibol would be better than c :)
Simon Clubley
2017-01-14 20:03:03 UTC
Post by u***@gmail.com
Post by Simon Clubley
So Bob, which language would you prefer instead for new kernel
level code ?
ada is fine, even dibol would be better than c :)
I agree Ada is a good language for kernel use, but as I've already
mentioned it's very unlikely to be used in the VMS kernel for any
new work due to various practical reasons, including the compiler
availability issues.

However, as you should well know, Dibol is _totally_ unsuitable
for kernel mode code.

One regret that I have is that DEC never pushed the Pillar work
to completion.

The language had a couple of things which should have been fixed
(including around register access IIRC) but at the time Pillar was
being developed within DEC there was a market opportunity to
establish it alongside C, especially if DEC didn't try locking down
the use of the language and instead made the language specification
freely usable.

If it had taken off, we would have still had C, but we would now
also have had a viable alternative to C in ways that we don't
currently have.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Neil Rieck
2017-01-13 12:24:28 UTC
Post by u***@gmail.com
Post by Neil Rieck
This hit my INBOX today from Sue Skonetski at VSI. Enjoy!
http://vmssoftware.com/pdfs/State_of_Port_20170105.pdf
Neil Rieck
Waterloo, Ontario, Canada.
http://www3.sympatico.ca/n.rieck/
nice to see Sue still with OpenVMS. I wish the developers
would get away from the c code as this always causes problems
later.
Not sure why you would say this. Both "C" and C++ are like sharp razors. In the hands of a professional programmer they can be as useful as a surgeon's scalpel. In the hands of a non-professional you can use them to cut off your own head.

We all know that "C" began life at Bell Labs as a portable assembler but its first claim-to-fame was removing most of the human-caused bugs present in UNIX. UNIX was supplanted by Linux which literally runs 50% of all digital devices on planet Earth. What is less known is that Microsoft stabilized its whole software inventory by converting everything to C or C++ (they only use C++ but much of their code doesn't use any of the C++ extensions).

http://www3.sympatico.ca/n.rieck/docs/technological_change.html#epiphany2

Click here: http://www.stroustrup.com/applications.html then type control-F then find MICROSOFT. Quote: Literally everything at Microsoft is [now] built using recent flavors of Visual C++

Neil Rieck
Waterloo, Ontario, Canada.
http://www3.sympatico.ca/n.rieck/
Arne Vajhøj
2017-01-13 14:18:16 UTC
On Wednesday, January 11, 2017 at 10:48:58 AM UTC-5,
I wish the developers would get
away from the c code as this always causes problems later.
Not sure why you would say this. Both "C" and C++ are like sharp
razors. In the hands of a professional programmer they can be as
useful as a surgeon's scalpel. In the hands of a non-professional you
can use them to cut off your own head.
Two notes.

Modern style C++ is a lot safer than C and 1980's style C++.

I don't see it as professional vs non-professional. I see it as
good vs mediocre vs bad. With a 10%-80%-10% distribution. And it
is a real problem. If C was used for only writing OS kernels then
no problem - you can hire among the top 10%. But if you want to
use it for much wider business applications, then it is difficult to
stay within the top 10% and as soon as you go to the remaining
90% then problems tend to start showing up.

Arne