Discussion:
strlcpy missing on Linux
Thomas Maier-Komor
2006-04-07 10:31:02 UTC
Hi,

does anybody know if there is any specific reason, why Linux's glibc is
missing strlcpy?

TIA,
Tom
Nils O. Selåsdal
2006-04-07 11:04:24 UTC
Post by Thomas Maier-Komor
Hi,
does anybody know if there is any specific reason, why Linux's glibc is
missing strlcpy?
Follow the threads at
http://sources.redhat.com/ml/libc-alpha/2000-08/msg00053.html
http://sources.redhat.com/ml/libc-alpha/2002-01/msg00001.html
Thomas Maier-Komor
2006-04-07 11:25:30 UTC
Post by Nils O. Selåsdal
Post by Thomas Maier-Komor
Hi,
does anybody know if there is any specific reason, why Linux's glibc is
missing strlcpy?
Follow the threads at
http://sources.redhat.com/ml/libc-alpha/2000-08/msg00053.html
http://sources.redhat.com/ml/libc-alpha/2002-01/msg00001.html
Thanks. That was a little bit enlightening. I didn't read everything,
but I can say that I totally disagree with the opinion that a program
always has to know how long its strings are.

Reason:
Reading from standard input, you will never be able to determine how
much data you will get. In this case strlcpy can ease the implementation
because it is a) safe, b) tells you when it runs out of memory, c) says
how much has been copied. So it is really convenient to implement an
exception mechanism that supports starting with a small buffer and
increasing it in size when necessary.

But I think this isn't the first time that Linux or glibc developers
know better how to solve problems that they don't have. Very sad.

Tom
Nils O. Selåsdal
2006-04-07 11:32:18 UTC
Post by Thomas Maier-Komor
Post by Nils O. Selåsdal
Post by Thomas Maier-Komor
Hi,
does anybody know if there is any specific reason, why Linux's glibc is
missing strlcpy?
Follow the threads at
http://sources.redhat.com/ml/libc-alpha/2000-08/msg00053.html
http://sources.redhat.com/ml/libc-alpha/2002-01/msg00001.html
Thanks. That was a little bit enlightening. I didn't read everything,
but I can say that I totally disagree with the opinion that a program
always has to know how long its strings are.
Reading from standard input, you will never be able to determine how
much data you will get. In this case strlcpy can ease the implementation
because it is a) safe, b) tells you when it runs out of memory, c) says
how much has been copied. So it is really convenient to implement an
exception mechanism that supports starting with a small buffer and
increasing it in size when necessary.
If you're reading input, you know how much input you did read, and how
much you have read. I don't see why strlcpy helps in that regard.

I also don't like randomly chopping off strings, which is what strlcpy
will do for you. It's better than the alternative of overflowing buffers
but imo both cases show you're sloppy in what you accept and validate
for your input.
Thomas Maier-Komor
2006-04-07 12:08:44 UTC
Post by Nils O. Selåsdal
If you're reading input, you know how much input you did read, and how
much you have read. I don't see why strlcpy helps in that regard.
But if you tokenize your input and don't care about the length of the
tokens, it often doesn't really make sense to pass around the sizes of
the strings handled. I agree that you know the lengths most of the time,
but if buffer handling is the only reason for passing these values around
and keeping them up to date, it is often easier to discard them and
recalculate them when needed.
Post by Nils O. Selåsdal
I also don't like randomly chopping off strings, which is what strlcpy
will do for you. It's better than the alternative of overflowing buffers
but imo both cases show you're sloppy in what you accept and validate
for your input.
----------
$ uname -a
SunOS anduril 5.10 Generic_118833-03 sun4u sparc SUNW,Sun-Blade-1000
$ man strlcpy
The strlcpy() function copies at most dstsize-1 characters
(dstsize being the size of the string buffer dst) from src
to dst, truncating src if necessary. The result is always
null-terminated. The function returns strlen(src). Buffer
overflow can be checked as follows:

     if (strlcpy(dst, src, dstsize) >= dstsize)
             return -1;
--------

So strlcpy returns the size of the input string and it gives you the
opportunity to resize the buffer if needed. So no data will be lost, and
there is nothing sloppy...

E.g.:
char *buffer;   /* your buffer */
size_t bufsize; /* current size of the buffer */

while (strlcpy(buffer, src, bufsize) >= bufsize) {
        buffer = realloc(buffer, bufsize += getpagesize());
        assert(buffer);
}

This approach is especially useful if you can guess a buffer size that
is big enough for 99% of all cases. The remaining cases can then easily
be handled as shown above. And don't forget that returning a token
together with its length from a function requires a struct or pointers
carrying a char * and a size_t. Using only a char * can make things easier.

But this is probably also a matter of style.
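
For anyone who wants to try this on a glibc that lacks strlcpy, here is a
minimal sketch of the same idea. It is not taken from any libc:
sketch_strlcpy and copy_growing are hypothetical names, the fallback only
mirrors the semantics quoted from the man page above, and the buffer is
grown once to exactly strlen(src) + 1 instead of page by page.

```c
#include <stdlib.h>
#include <string.h>

/* Fallback with strlcpy semantics (hypothetical name): copies at most
 * dstsize-1 bytes, always NUL-terminates when dstsize > 0, and returns
 * strlen(src). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    if (dstsize > 0) {
        /* Copy as much as fits, leaving one byte for the terminator. */
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;  /* >= dstsize means the copy was truncated */
}

/* Copy src into *bufp, growing the buffer once if it is too small.
 * Returns 0 on success, -1 if realloc fails (the old buffer stays valid). */
int copy_growing(char **bufp, size_t *sizep, const char *src)
{
    size_t need = sketch_strlcpy(*bufp, src, *sizep);

    if (need >= *sizep) {
        /* The return value says exactly how much room is required. */
        char *tmp = realloc(*bufp, need + 1);
        if (tmp == NULL)
            return -1;
        *bufp = tmp;
        *sizep = need + 1;
        sketch_strlcpy(*bufp, src, *sizep);
    }
    return 0;
}
```

Keeping the realloc result in a temporary means the old buffer is not
leaked if the allocation fails, which the assert() version above sidesteps
by aborting.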

Cheers,
Tom
Casper H.S. Dik
2006-04-07 13:24:30 UTC
Post by Nils O. Selåsdal
I also don't like randomly chopping off strings, which is what strlcpy
will do for you. It's better than the alternative of overflowing buffers
but imo both cases show you're sloppy in what you accept and validate
for your input.
Since strlcpy() also tells you that it did some chopping, such error
conditions are easily detected.

Memcpy() leads to a mess which is hard to verify as being correct;
strlcat() and strlcpy() allow easy coding with easy error handling.

snprintf() and strl*() return the amount of space actually needed; *not*
so that you can just discard the remainder, but so that you can bail out
when the return value is greater than or equal to the buffer size.
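
A minimal sketch of that bail-out pattern with snprintf(), which glibc does
provide. join_path is a hypothetical helper invented for this example; the
only assumption about snprintf() is its C99 behaviour of returning the
length the full result would have had, not counting the terminating NUL.

```c
#include <stdio.h>
#include <stddef.h>

/* Build "dir/name" in out (join_path is a hypothetical helper).
 * Returns 0 on success, -1 if snprintf reported an error or the result
 * did not fit. On truncation with outsize > 0, out still holds a
 * NUL-terminated prefix, but the caller is told not to trust it. */
int join_path(char *out, size_t outsize, const char *dir, const char *name)
{
    int n = snprintf(out, outsize, "%s/%s", dir, name);

    /* n is the length needed; n >= outsize means there was no room
     * left for the terminator, so the output was cut short. */
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    return 0;
}
```

The same comparison works for strlcpy() and strlcat(), whose return values
are already a size_t.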

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
James Antill
2006-04-09 03:36:35 UTC
Post by Thomas Maier-Komor
Thanks. That was a little bit enlightening. I didn't read everything,
but I can say that I totally disagree with the opinion that a program
always has to know how long its strings are.
That's nice, here's an overflow as your present ... I know, I know
everyone else got you one too. Ahh well.
Post by Thomas Maier-Komor
Reading from standard input, you will never be able to determine how
much data you will get.
So you always allocate the largest amount of memory possible when reading
from stdin?
Post by Thomas Maier-Komor
In this case strlcpy can ease the implementation
because it is a) safe, b) tells you when it runs out of memory, c) says
how much has been copied. So it is really convenient to implement an
exception mechanism that supports starting with a small buffer and
increasing it in size when necessary.
Ahh, so you implement half a worthwhile string API using strlcpy() as a
base. Always a good choice to reinvent something pretty much everyone's
screwed reinventing ... have fun with that.
Post by Thomas Maier-Komor
But I think this isn't the first time that Linux or glibc developers
know better how to solve problems that they don't have. Very sad.
And you're not the only one who thinks std. C-string functions are "safe
enough, if you are just perfect enough when using them". Very sad.

http://www.and.org/vstr/security#libcstring
--
James Antill -- ***@and.org
http://www.and.org/and-httpd/
Thomas Maier-Komor
2006-04-09 12:51:05 UTC
Post by James Antill
Post by Thomas Maier-Komor
Thanks. That was a little bit enlightening. I didn't read everything,
but I can say that I totally disagree with the opinion that a program
always has to know how long its strings are.
That's nice, here's an overflow as your present ... I know, I know
everyone else got you one too. Ahh well.
If you know the maximum size of your strings and act accordingly, you
won't get any overflows. Please reread my posting and don't cut away the
important parts.
Post by James Antill
Post by Thomas Maier-Komor
Reading from standard input, you will never be able to determine how
much data you will get.
So you always allocate the largest amount of memory possible when reading
from stdin?
Of course not. If you don't know how much you will get, you have to
allocate a block and see whether it is enough. If it isn't, you must
realloc and then continue reading. There are of course other valid
approaches as well.
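
One way to spell out the allocate, check, realloc, continue cycle is the
sketch below. read_all is a hypothetical helper, and the 64-byte starting
block and the doubling strategy are arbitrary choices made for illustration,
not something proposed in this thread.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read fp until EOF into a malloc'd, NUL-terminated buffer.
 * Stores the byte count in *lenp and returns the buffer, or NULL on
 * allocation or read failure. read_all is a hypothetical helper. */
char *read_all(FILE *fp, size_t *lenp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always keep one byte free for the terminator. */
        size_t got = fread(buf + len, 1, cap - len - 1, fp);
        len += got;
        if (len < cap - 1)
            break;              /* short read: EOF or error */

        /* Block is full: double it and continue reading. */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *lenp = len;
    return buf;
}
```

Calling read_all(stdin, &len) hands back everything that was typed or
piped in, and the caller frees the buffer when done.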
Post by James Antill
Post by Thomas Maier-Komor
In this case strlcpy can ease the implementation
because it is a) safe, b) tells you when it runs out of memory, c) says
how much has been copied. So it is really convenient to implement an
exception mechanism that supports starting with a small buffer and
increasing it in size when necessary.
Ahh, so you implement half a worthwhile string API using strlcpy() as a
base. Always a good choice to reinvent something pretty much everyone's
screwed reinventing ... have fun with that.
I didn't invent strlcpy. It has been in the *BSDs for many years.
Please read the discussion about strlcpy yourself, Casper Dik's comment
in this thread, and my comment on how strlcpy can safely handle
certain situations and ease implementation at the same time.
Post by James Antill
Post by Thomas Maier-Komor
But I think this isn't the first time that Linux or glibc developers
know better how to solve problems that they don't have. Very sad.
And you're not the only one who thinks std. C-string functions are "safe
enough, if you are just perfect enough when using them". Very sad.
http://www.and.org/vstr/security#libcstring
I know about security. You don't have to point me at something I am
well aware of. If you had read my posting more closely and not only
objected to me complaining that Linux is missing something, then you
would have seen that I _do_ care about security.

BTW: security is only really important if you have to handle untrusted
data. My observation came up when I wrote a program that handles user
input. Additionally, there are many applications that must be written in
C and cannot use C++ std::string, an arbitrary third-party string
handling library, or something similar. C++ std::string is obviously
superior, but if you must not link against the C++ library, it is
simply not an option.
