Discussion:
Memory Safe Programming Languages
Stephen Hoffman
2024-03-06 23:57:47 UTC
Recent US Government recommendations on programming:

https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-ONCD-Technical-Report.pdf


From a related document from US NSA: "Examples of memory safe language
include Python, Java, C#, Go, Delphi/Object Pascal, Swift, Ruby, Rust,
and Ada."

https://media.defense.gov/2023/Apr/27/2003210083/-1/-1/0/CSI_SOFTWARE_MEMORY_SAFETY_V1.1.PDF


OpenVMS has three of those languages available, so there's that.
(Python, Java, and the available Pascal probably also counts.)
--
Pure Personal Opinion | HoffmanLabs LLC
Lawrence D'Oliveiro
2024-03-07 00:32:03 UTC
Post by Stephen Hoffman
https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-ONCD-Technical-Report.pdf
Big discussion about this over on comp.lang.c. It’s clear some see this
kind of recommendation as a threat.
bill
2024-03-07 01:12:35 UTC
Post by Stephen Hoffman
https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-ONCD-Technical-Report.pdf
From a related document from US NSA: "Examples of memory safe language
include Python, Java, C#, Go, Delphi/Object Pascal, Swift, Ruby, Rust,
and Ada."
https://media.defense.gov/2023/Apr/27/2003210083/-1/-1/0/CSI_SOFTWARE_MEMORY_SAFETY_V1.1.PDF
OpenVMS has three of those languages available, so there's that.
(Python, Java, and the available Pascal probably also counts.)
And 40 years ago we had safe C. We all know how well that
survived. If people weren't willing to choose memory safety
back then, why would they be expected to now?

Hmmm... I don't see Jovial on that list. Are they going to
try and force the Air Force to use Ada again?

bill
Lawrence D'Oliveiro
2024-03-07 01:42:43 UTC
Post by bill
And 40 years ago we had safe C. We all know how well that
survived.
MISRA still is in production use today.
Post by bill
Are they going to try and force the Air Force to use Ada again?
Did it ever stop?

The life-support system on the International Space Station was written
in Ada. And then there is SPARK
<https://devclass.com/2022/11/08/spark-as-good-as-rust-for-safer-coding-adacore-cites-nvidia-case-study/>,
which is producing a subset of Ada with even stronger correctness
properties. I gather their aim is ultimately to include the whole of
Ada in that set.
Simon Clubley
2024-03-07 18:41:56 UTC
Post by bill
And 40 years ago we had safe C. We all know how well that
survived. If people weren't willing to choose memory safety
back then, why would they be expected to now?
There's no such thing as a "safe" language.

What there is are "safer" languages in which it is a lot harder to
make accidental mistakes, and harder for accidental mistakes you do
make to remain undetected, especially if you use the full capabilities
of the language.

For one really simple example, don't just try to write C code using
Ada syntax, and place everything in plain Integers, but use the full
data type modelling capabilities of the language.

Also, use ranged data types to constrain the allowed values (which was
something that Rust couldn't properly do the last time I checked;
attempts to implement this in Rust were part of some addon library,
not part of the core language).
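
To illustrate what a ranged type buys you, here is a minimal sketch of roughly the same idea hand-rolled in C. It assumes a made-up percent_t type standing in for an Ada declaration like "type Percent is range 0 .. 100;" and every name in it is hypothetical. Note that in C the check lives in a convention (always go through percent_make) rather than in the language, so nothing stops code from writing to the value field directly.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the Ada declaration
   "type Percent is range 0 .. 100;". All names are illustrative. */
enum { PERCENT_MIN = 0, PERCENT_MAX = 100 };

typedef struct {
    int value;   /* intended invariant: PERCENT_MIN <= value <= PERCENT_MAX */
} percent_t;

/* Checked constructor: stores raw and returns true only when raw lies
   inside the range; otherwise returns false and leaves *out untouched. */
static bool percent_make(int raw, percent_t *out)
{
    if (raw < PERCENT_MIN || raw > PERCENT_MAX)
        return false;
    out->value = raw;
    return true;
}

/* Checked addition: rejects a sum that would leave the range.
   The int addition cannot overflow when both operands honour the
   invariant, because each is then at most 100. */
static bool percent_add(percent_t a, percent_t b, percent_t *out)
{
    return percent_make(a.value + b.value, out);
}
```

With this, percent_make(101, &p) and adding 60 to 50 are both rejected at run time, which is about as far as plain C gets without compiler support.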

The recommendation is to switch to using these "safer" languages, not
some mythical "safe" language.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
bill
2024-03-07 20:50:29 UTC
Post by Simon Clubley
Post by bill
And 40 years ago we had safe C. We all know how well that
survived. If people weren't willing to choose memory safety
back then, why would they be expected to now?
There's no such thing as a "safe" language.
What there is are "safer" languages in which it is a lot harder to
make accidental mistakes, and harder for accidental mistakes you do
make to remain undetected, especially if you use the full capabilities
of the language.
For one really simple example, don't just try to write C code using
Ada syntax, and place everything in plain Integers, but use the full
data type modelling capabilities of the language.
Also, use ranged data types to constrain the allowed values (which was
something that Rust couldn't properly do the last time I checked;
attempts to implement this in Rust were part of some addon library,
not part of the core language).
The recommendation is to switch to using these "safer" languages, not
some mythical "safe" language.
But my argument is that C had the chance to be one of those
"safer" languages. Users rejected it. Have to wonder why.

And, on another note regarding C and Ada. The original GNAT
compiler converted Ada into C and compiled it with GCC. Now,
it seems to me that points at two possible concepts. One is
that if Ada can be done in C then it has all the same flaws
and warts. Not sure I would like to go in that direction.
The other is much more interesting. And that is the concept
that C can, obviously, be just as safe as Ada. The question
then becomes why isn't it? See my first paragraph. :-)

bill
Lawrence D'Oliveiro
2024-03-07 21:07:22 UTC
And that is the concept that C can, obviously, be just as safe as Ada.
Not without help, though: namely, the constraints imposed by an Ada
compiler.
John Dallman
2024-03-08 10:07:00 UTC
Post by Lawrence D'Oliveiro
And that is the concept that C can, obviously, be just as safe as Ada.
Not without help, though: namely, the constraints imposed by an Ada
compiler.
Indeed. Looking at the levels of the implementation can be helpful:

We don't have memory-safe instruction sets. The idea isn't impossible,
but it would be a lot more complex and/or restrictive than any of the
currently popular instruction sets.

We can implement memory-safer languages on top of unsafe instruction sets.
But most of the memory safety comes from the restrictions of the language:
if you take the machine code version of a program written in a safer
language, it is not obvious from inspection that it is safer, and proving
that it is safe is impossible (see the halting problem).

The same applies to compiling a memory-safer language (Ada) into a
memory-unsafe language (C). The resulting C is memory-safer, but this
isn't obvious from the code and isn't provable.

John
Scott Dorsey
2024-03-08 14:00:11 UTC
Post by John Dallman
We don't have memory-safe instruction sets. The idea isn't impossible,
but it would be a lot more complex and/or restrictive than any of the
currently popular instruction sets.
iAPX 432.

I see a great need.
--scott
--
"C'est un Nagra. C'est suisse, et tres, tres precis."
John Dallman
2024-03-08 20:35:00 UTC
Post by Scott Dorsey
Post by John Dallman
We don't have memory-safe instruction sets. The idea isn't
impossible, but it would be a lot more complex and/or restrictive
than any of the currently popular instruction sets.
iAPX 432.
What the hell. I've collected various PDFs and will read up on it.

John
Simon Clubley
2024-03-11 13:20:26 UTC
Post by John Dallman
Post by Scott Dorsey
Post by John Dallman
We don't have memory-safe instruction sets. The idea isn't
impossible, but it would be a lot more complex and/or restrictive
than any of the currently popular instruction sets.
iAPX 432.
What the hell. I've collected various PDFs and will read up on it.
Also note what primary language it used. :-) It was a good idea, but the
technology of the time simply was not yet up to it. Reminds me of the
1993 Newton compared to the PDAs we had as standard a decade or so later.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Lawrence D'Oliveiro
2024-03-08 20:44:49 UTC
Post by John Dallman
We don't have memory-safe instruction sets.
The CHERI project is reviving the old “capability” idea, which might help.
Arm’s “Morello” research chip is part of that
<https://www.theregister.com/2022/07/26/cheri_computer_runs_kde/>.
Post by John Dallman
The same applies to compiling a memory-safer language (Ada) into a
memory-unsafe language (C). The resulting C is memory-safer, but this
isn't obvious from the code and isn't provable.
If the original language is provably safe, that should carry over into the
code it generates.
Arne Vajhøj
2024-03-07 21:37:18 UTC
Post by bill
Post by Simon Clubley
And 40 years ago we had safe C.  We all know how well that
survived.  If people weren't willing to choose memory safety
back then, why would they be expected to now?
There's no such thing as a "safe" language.
The recommendation is to switch to using these "safer" languages, not
some mythical "safe" language.
But my argument is that C had the chance to be one of those
"safer" languages.  Users rejected it.  Have to wonder why.
Being memory safe does not work for some C usage (direct HW access).

And we don't know much about the implementation quality of that
80's C compiler you keep referring to. Even a good idea can be
fucked up by a bad implementation.

Or maybe the time was not ready for it then but is now. The
p-code idea was not a big success back then, but today
the same concept is more widely used than compiling to
native code.
Post by bill
And, on another note regarding C and Ada.  The original GNAT
compiler converted Ada into C and compiled it with GCC.
Like GnuCOBOL today?

I thought Gnat always worked like other GCC compilers.
Post by bill
Now, it seems to me that points at two possible concepts.  One is
that if Ada can be done in C then it has all the same flaws
and warts.  Not sure I would like to go  in that direction.
The other is much more interesting.  And that is the concept
that C can, obviously, be just as safe as Ada.  The question
then becomes why isn't it?  See my first paragraph.   :-)
I don't think that logic is true.

The language level of safety very much depends on the
definition of the language.

If language X is transpiled into language Y (instead
of compiled to native object code), then it is very
much possible for X compiler to prevent something that
Y compiler allows. X can be memory safe even though Y is
not.

Arne
Lawrence D'Oliveiro
2024-03-07 23:29:40 UTC
If language X is transpiled into language Y (instead of compiled to
native object code), then it is very much possible for X compiler to
prevent something that Y compiler allows. X can be memory safe even
though Y is not.
The same applies very much to machine code, of course. That’s why we don’t
need a separate term “transpile” to distinguish the process from what
“compile” does.
Arne Vajhøj
2024-03-07 23:48:07 UTC
Post by Lawrence D'Oliveiro
If language X is transpiled into language Y (instead of compiled to
native object code), then it is very much possible for X compiler to
prevent something that Y compiler allows. X can be memory safe even
though Y is not.
The same applies very much to machine code, of course. That’s why we don’t
need a separate term “transpile” to distinguish the process from what
“compile” does.
The convention (today) is:

compile = transform to lower level language
transpile = transform to same level language

It is what it is.

Arne
Lawrence D'Oliveiro
2024-03-08 00:51:41 UTC
Post by Arne Vajhøj
compile = transform to lower level language
transpile = transform to same level language
But C is at a lower level than Ada.
Simon Clubley
2024-03-08 13:14:59 UTC
Post by bill
But my argument is that C had the chance to be one of those
"safer" languages. Users rejected it. Have to wonder why.
Did it fix only one special case - buffer overflows - or was it
a safer language in general ? For example, how strong was type
checking in this safer C ?
Post by bill
And, on another note regarding C and Ada. The original GNAT
compiler converted Ada into C and compiled it with GCC. Now,
it seems to me that points at two possible concepts. One is
that if Ada can be done in C then it has all the same flaws
and warts. Not sure I would like to go in that direction.
The other is much more interesting. And that is the concept
that C can, obviously, be just as safe as Ada. The question
then becomes why isn't it? See my first paragraph. :-)
Well, that's a load of nonsense and shows a total lack of understanding
of how compilers work. All compiled languages are ultimately compiled
into assembly language opcodes. That doesn't mean they are only as safe
as the assembly language they are compiled into.

OTOH, it could sound like the reasoning of someone trying to desperately
claim that C is somehow as safe as Ada. :-)

Also, how long did this GNAT compiler that translated into C
actually exist for ? Was it something that once existed for a couple
of years about 30-35 years ago and was never used again ?

I first started really using Ada compilers around the gcc 2.8 timeframe
(IIRC) and have never encountered this Ada to C translator you speak of.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
bill
2024-03-08 14:38:21 UTC
Post by Simon Clubley
Post by bill
But my argument is that C had the chance to be one of those
"safer" languages. Users rejected it. Have to wonder why.
Did it fix only one special case - buffer overflows - or was it
a safer language in general ? For example, how strong was type
checking in this safer C ?
Misuse of functions
Mismatched parameters
Array indexing/out of bounds
Stray pointers
Arithmetic errors/division by 0/overflow
Misuse of standard I/O
Misuse of string functions

Don't know what else it might have done as all I have are
the descriptions in the Software Sourcebooks.

As an interesting side note, this was not only available for
all the PDP-11 OSes, it was also available for Ultrix-32 and
VMS.
Post by Simon Clubley
Post by bill
And, on another note regarding C and Ada. The original GNAT
compiler converted Ada into C and compiled it with GCC. Now,
it seems to me that points at two possible concepts. One is
that if Ada can be done in C then it has all the same flaws
and warts. Not sure I would like to go in that direction.
The other is much more interesting. And that is the concept
that C can, obviously, be just as safe as Ada. The question
then becomes why isn't it? See my first paragraph. :-)
Well, that's a load of nonsense and shows a total lack of understanding
of how compilers work. All compiled languages are ultimately compiled
into assembly language opcodes. That doesn't mean they are only as safe
as the assembly language they are compiled into.
Well, I did discount explanation 1. :-)

But, explanation 2 still stands. If the "safe" code written in
Ada can be converted to C then, obviously, the same "safe" code
could be written directly using C. The question really is why do
programmers choose not to.
Post by Simon Clubley
OTOH, it could sound like the reasoning of someone trying to desperately
claim that C is somehow as safe as Ada. :-)
It is. It is not any shortcoming in the language that makes C
"unsafe". It is the practices of the programmers.
Post by Simon Clubley
Also, how long did this GNAT compiler that translated into C
actually exist for ? Was it something that once existed for a couple
of years about 30-35 years ago and was never used again.
Really don't remember. That was more than a lifetime ago in
computer years. :-)
Post by Simon Clubley
I first started really using Ada compilers around the gcc 2.8 timeframe
(IIRC) and have never encountered this Ada to C translator you speak of.
A lot of the early GNU compilers started as translations to C
and compilation with GCC (P2C, F2C). As has been stated many times,
C is really just a slightly higher level than assembler.
You know, one can easily write buffer overflows, out of bounds arrays,
type mismatches, etc. with assembler but no one blames the assembler
for it.

bill
Arne Vajhøj
2024-03-08 15:15:24 UTC
Post by Simon Clubley
Post by bill
And, on another note regarding C and Ada.  The original GNAT
compiler converted Ada into C and compiled it with GCC.  Now,
it seems to me that points at two possible concepts.  One is
that if Ada can be done in C then it has all the same flaws
and warts.  Not sure I would like to go  in that direction.
The other is much more interesting.  And that is the concept
that C can, obviously, be just as safe as Ada.  The question
then becomes why isn't it?  See my first paragraph.   :-)
Well, that's a load of nonsense and shows a total lack of understanding
of how compilers work. All compiled languages are ultimately compiled
into assembly language opcodes. That doesn't mean they are only as safe
as the assembly language they are compiled into.
Well, I did discount explanation 1.  :-)
But, explanation 2 still stands.  If the "safe" code written in
Ada can be converted to C then, obviously, the same "safe" code
could be written directly using C.  The question really is why do
programmers choose  not to.
Programmers are human. They try their best but they make mistakes.

If a large number of programmers write a huge application, then
there will be a big number of mistakes made. Inevitable.

And this is where the language comes in:

mistakes causing compile time error => mistakes get fixed during development

mistakes causing runtime error => mistakes get fixed during development
if found in test *or* result in unavailability of functionality in
production if not found in test

mistakes causing undefined behavior => mistakes get fixed during
development if found in test *or* result in unavailability of
functionality or data corruption or data leak or combination in
production if not found in test
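
A minimal sketch of those three outcomes, using division as the example. The names SCALE, div_checked and div_unchecked are made up for illustration, and it assumes a C11 compiler for static_assert.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* 1. Compile time error: SCALE is meant to be used as a divisor. If
      someone edits it to 0, the build stops and the mistake never
      reaches test or production. */
enum { SCALE = 100 };
static_assert(SCALE != 0, "SCALE is used as a divisor");

/* 2. Runtime error: a bad divisor is detected and reported to the
      caller, who can fail cleanly. INT_MIN / -1 is rejected too,
      because the quotient does not fit in an int. */
static bool div_checked(int num, int den, int *out)
{
    if (den == 0 || (num == INT_MIN && den == -1))
        return false;
    *out = num / den;
    return true;
}

/* 3. Undefined behavior: nothing checks den. With den == 0 the C
      standard places no requirement on what happens next. */
static int div_unchecked(int num, int den)
{
    return num / den;
}
```

The second and third functions compile equally happily; only the first kind of mistake is guaranteed to be caught before anything runs.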
Post by Simon Clubley
OTOH, it could sound like the reasoning of someone trying to desperately
claim that C is somehow as safe as Ada. :-)
It is.  It is not any shortcoming in the language that makes C
"unsafe".  It is the practices of the programmers.
The definition of a safe language is not a language that allows
safe code - the definition of a safe language is a language that
enforces safe code.

Arne
Dan Cross
2024-03-08 15:36:17 UTC
[snip]
But, explanation 2 still stands. If the "safe" code written in
Ada can be converted to C then, obviously, the same "safe" code
could be written directly using C. The question really is why do
programmers choose not to.
Not really. The generated "safe" code need not be as safe as
the original source Ada. For example, if this Ada compiler that
generates C does array bounds checking, and can statically
verify that an array access is within bounds, it need not
insert that check in the generated C code. Similarly with all
sorts of things; verifying alignment, weird type casts, etc.
The Ada compiler can generate spaghetti C that is unreadable to
a human and it doesn't matter, because it's just an intermediate
representation.
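
As a rough sketch of that point, assume a hypothetical Ada-to-C translator. The function names and the raise_constraint_error helper below are invented for illustration and are not claimed to be what GNAT or any real tool emits. Where the translator can prove an index is in range it can emit a bare access; where it cannot, it has to emit the check itself.

```c
#include <stddef.h>
#include <stdlib.h>

enum { N = 4 };

/* Hypothetical runtime hook standing in for raising Constraint_Error. */
static void raise_constraint_error(void)
{
    abort();
}

/* Ada source (illustrative):
     for I in A'Range loop Sum := Sum + A (I); end loop;
   I can only take values inside A'Range, so no check is emitted. */
static long sum_all(const int a[N])
{
    long sum = 0;
    for (size_t i = 0; i < N; i++)
        sum += a[i];   /* bare access: range already proven upstream */
    return sum;
}

/* Ada source (illustrative):  return A (K);  with K supplied by the caller.
   The translator cannot know K's value, so the check stays in the output. */
static int get_at(const int a[N], long k)
{
    if (k < 0 || k >= N)
        raise_constraint_error();
    return a[k];
}
```

Someone reading only sum_all has no way to tell that its safety was established elsewhere, which is why hand-written C that looks the same carries no such guarantee.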
Post by Simon Clubley
OTOH, it could sound like the reasoning of someone trying to desperately
claim that C is somehow as safe as Ada. :-)
It is. It is not any shortcoming in the language that makes C
"unsafe". It is the practices of the programmers.
This is reductio ad absurdum. There's a lot in the language not
to be liked. Quick, is the following always well-defined?

#include <stdint.h>

uint16_t
mul(uint16_t a, uint16_t b)
{
    return a * b;
}

It's super easy to trivially fall over UB in C.
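
For anyone who wants the answer spelled out: on an implementation where int is 32 bits, both uint16_t operands are promoted to signed int before the multiply, so mul(0xFFFF, 0xFFFF) overflows int, which is undefined behaviour. A minimal sketch of one common fix follows. The name mul_defined is made up for illustration; the casts force the arithmetic into unsigned int, where wraparound is defined.

```c
#include <stdint.h>

/* Same interface as mul() above, but the multiplication is carried out
   in unsigned int, so it can never hit signed overflow. */
static uint16_t mul_defined(uint16_t a, uint16_t b)
{
    /* The casts keep the operands from being promoted to signed int.
       The final cast truncates the product to its low 16 bits. */
    return (uint16_t)((unsigned int)a * (unsigned int)b);
}
```

The result is the product reduced modulo 65536, whatever the width of int.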
Post by Simon Clubley
Also, how long did this GNAT compiler that translated into C
actually exist for ? Was it something that once existed for a couple
of years about 30-35 years ago and was never used again.
Really don't remember. That was more than a lifetime ago in
computer years. :-)
Post by Simon Clubley
I first started really using Ada compilers around the gcc 2.8 timeframe
(IIRC) and have never encountered this Ada to C translator you speak of.
A lot of the early GNU compilers started as translations to C
and compilation with GCC (P2C, F2C). As has been stated many times,
C is really just a slightly higher level than assembler.
You know, one can easily write buffer overflows, out of bounds arrays,
type mismatches, etc. with assembler but no one blames the assembler
for it.
Sure they do. There's a reason that, these days, assembler is
mostly a _target_ and not a source language. There are
exceptions of course, but these usually fall into the domain of
either legacy code (z/Arch assembler, MACRO-32) or specialized
use (e.g., the supervisor instruction set in an OS). Most
programs these days are written in higher level languages
because we know that a) programming in assembler is often
tedious and b) it is error-prone.

- Dan C.
Scott Dorsey
2024-03-08 00:48:47 UTC
Post by bill
Hmmm... I don't see Jovial on that list. Are they going to
try and force the Air Force to use Ada again?
All the stuff we used to do in Jovial and in Hal/S, we do in C now.
It's definitely a step down. Ada is a better choice for realtime
stuff but it doesn't compile down very compactly. I like the coroutines.
--scott
--
"C'est un Nagra. C'est suisse, et tres, tres precis."