In article <3xs_g.22687$***@dukeread09>,
Arne Vajhøj <***@vajhoej.dk> writes:
> Bill Gunshannon wrote:
>> In article <wbf_g.22664$***@dukeread09>,
>> Arne Vajhøj <***@vajhoej.dk> writes:
>>> Bill Gunshannon wrote:
>>>> Actually, C doesn't care about null terminated strings any more than
>>>> any other language. You are free to create arrays of char that contain
>>>> strings in any format you prefer. All you have to do is write routines
>>>> to deal with them.
>>> Not quite true.
>>>
>>> The compiler itself generates null terminated char arrays for
>>> string literals.
>>
>> And, as I stated later on, so does the Ada compiler.
>
> Yep, but that was wrong too.
OK, if you say so. But all the examples I had here (we used to use
Ada and the GNAT compiler as the basic beginner programming language)
have ASCII characters with at least one null at the end. Ada
may not use that null for anything, but it's still there.
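
For comparison, the C side of this is easy to check. A minimal sketch
(the function names are made up for illustration) showing that the
storage for a C string literal includes the terminating null while
strlen() does not count it:

```c
#include <stddef.h>
#include <string.h>

/* Bytes the compiler reserves for the literal "abc":
   three characters plus the terminating null. */
size_t literal_storage(void)
{
    return sizeof "abc";
}

/* Characters before the first null, which is all strlen() sees. */
size_t literal_length(void)
{
    return strlen("abc");
}
```

On a conforming C compiler the first function should return 4 and the
second 3. This says nothing about what GNAT emits; that still needs a
look at the actual binary.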
>
>> It is not the
>> composition of the character array that is the problem, it is the
>> handling of it.
>
> To make it safe there should be a length.
True.
>
> And if there is a length, then the null byte is not needed.
Also true. So, why does the Ada compiler put them there? Probably just
wasting space as it does in many other ways as well.
>
>>>> And for all of you who feel much more comfortable with Ada, now that we
>>>> know there is no longer a DEC Ada and GNAT is the future, try writing a
>>>> program with a few strings in it, compile it with GNAT and then pass the
>>>> executable thru the Unix "strings" command. For those of you not familiar
>>>> with it, it will find and print out the strings in a binary file. It does
>>>> this by looking for strings of ASCII that end in a null. It usually returns
>>>> a lot of bogus strings because lots of null terminated ASCII happens by pure
>>>> chance, but it also returns all the real strings. Why do I mention this?
>>>> Because it will print out all the strings in your Ada program. Anybody
>>>> want to take a guess at what that means?
>>> The format is not a C string.
>>>
>>> The format appears to be (based on a hex dump):
>>>
>>> * N bytes
>>> * 0-7 nul bytes padding length to multiple of 8
>>> * 4 bytes with 1
>>> * 4 bytes with N
>>
>> Sure looks like one to the strings command. :-)
>
> Probably because the 1 byte is not printable.
After it has seen the first null, the rest means nothing until it finds
the next thing that looks like a string. Four 1's and four bytes of
relative garbage would not trigger this. If I can find a machine with
GNAT still on it I will have to try a few tests to see if I can get it
to create a string with no null at the end. (Just curious!)
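
Until I find that machine, here is a rough way to reason about it. This
is only a sketch, and it rests on assumptions: that "strings" reports
any run of at least four printable characters followed by a
non-printable byte (which is how the GNU documentation describes it),
and that the layout from the hex dump above stores its two 4-byte
values low byte first. The function and the two sample buffers are made
up for illustration; they are not taken from a real GNAT binary.

```c
#include <ctype.h>
#include <stddef.h>

/* Count runs of at least min_len printable bytes that are ended by a
   non-printable byte or by the end of the buffer. */
size_t count_printable_runs(const unsigned char *buf, size_t n,
                            size_t min_len)
{
    size_t runs = 0;
    size_t run = 0;

    for (size_t i = 0; i < n; i++) {
        if (isprint(buf[i])) {
            run++;
        } else {
            if (run >= min_len)
                runs++;
            run = 0;
        }
    }
    if (run >= min_len)
        runs++;
    return runs;
}

/* Assumed layout: 5 characters, 3 nul bytes of padding,
   4 bytes holding 1, 4 bytes holding N = 5 (low byte first). */
const unsigned char padded[] = {
    'H', 'e', 'l', 'l', 'o', 0, 0, 0, 1, 0, 0, 0, 5, 0, 0, 0
};

/* Same layout with N = 8, so no nul padding follows the text. */
const unsigned char unpadded[] = {
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 1, 0, 0, 0, 8, 0, 0, 0
};
```

Under those assumptions the scan finds one run in each buffer, with or
without a null directly after the text. Whether the real tool and the
real GNAT output behave the same way is exactly what the test on an
actual machine would have to show.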
>
>> As I stated, there is
>> nothing in the world stopping people from changing what constitutes a
>> "string" in C.
>
> Of course there is.
>
> The C standard section 6.4.5 defines how a string literal
> is stored as bytes.
OK, and who wrote that definition? And why can't they re-define it
otherwise? And, if the null terminated string was such a problem,
why didn't they redefine it in the first place?
>
>> You can easily use the UCSD definition and make the first
>> byte (or more, depending on the size limitation you want to impose) a
>> length indicator and then write replacements for the standard string
>> manipulation routines to deal with them. Or use the VMS Descriptor
>> concept with the same requirement for new routines. The source to
>> numerous C compilers is available. One could easily make all of this
>> totally transparent to the developer. Why has it not been done? Apparently,
>> not enough people really care.
>
> It would no longer be C.
If the ANSI C committee defines it that way, it would most certainly be
C. (Not my opinion, but when I complain to people about ANSI C not being
C because it diverged from K&R I am always told the above!)
>
> And if you are making a new language, then you can just as easily
> fix a few other things.
True, there are already languages that don't have this problem, like PL/1.
(Hi Tom! :-)
But if the ANSI C Committee makes the change, it would still be C by
virtue of their blessing. So, why has this not been fixed? Comes
right back to my statement about interest. Obviously, not enough
people care to make the move. Heck, it would even be possible to do
it in such a manner that you could maintain complete compatibility
with existing source code if you really wanted to. After all, the
data type "string" doesn't currently exist in C. The ANSI C Committee
could add it. And the language would still be C in the opinions of
the majority of the world.
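
To show how little it takes in today's C, here is a minimal sketch of a
counted string: a length plus a pointer, loosely in the spirit of a VMS
descriptor. The type name cstr and all three routines are hypothetical;
nothing like them exists in the standard library, and a real design
would need many more routines than this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length travels with the data. */
typedef struct {
    size_t len;
    const char *ptr;
} cstr;

/* Wrap an existing C string. The terminator is used once to find the
   length and is never relied on afterwards. */
cstr cstr_from(const char *s)
{
    cstr r = { strlen(s), s };
    return r;
}

/* Copy src into dst, which has room for cap bytes. Returns the number
   of bytes copied; truncates instead of overrunning dst. */
size_t cstr_copy(char *dst, size_t cap, cstr src)
{
    size_t n = src.len < cap ? src.len : cap;
    memcpy(dst, src.ptr, n);
    return n;
}

/* Compare by length and content; embedded nulls are ordinary data. */
int cstr_equal(cstr a, cstr b)
{
    return a.len == b.len && memcmp(a.ptr, b.ptr, a.len) == 0;
}
```

Because existing code using char * and the str*() routines is left
alone, something along these lines could sit next to it rather than
replace it.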
>
> And that has been done.
>
> None of the big languages invented after 1990 (primarily Java and C#)
> uses the null terminated string concept. They use objects. Which
> are in reality equivalent to VMS descriptors.
Take some time to search for some of the recent comments from experts in
the OO field. Even they are starting to realize the emperor has no clothes.
(Not saying VMS Descriptors are wrong, because I think they are a good
idea, just pointing out that even the OO community is beginning to
come to the realization that OO is no more the universal answer than
any other paradigm we have come up with in the industry in the past.)
bill
--
Bill Gunshannon | de-moc-ra-cy (di mok' ra see) n. Three wolves
***@cs.scranton.edu | and a sheep voting on what's for dinner.
University of Scranton |
Scranton, Pennsylvania | #include <std.disclaimer.h>