Discussion:
clarification please.
G G
2020-08-04 07:40:04 UTC
what does the author mean? could you please say this
a different way. maybe an example or two :-)
-------

from Operating System Concepts, 10th edition, page 135.

Ports are finite in size and unidirectional; for two-way communication, a message is sent to one port, and a response is sent to a separate reply port. Each port may have multiple senders, but only one receiver.
Jorgen Grahn
2020-08-04 14:01:29 UTC
Post by G G
what does the author mean? could you please say this
a different way. may an example or two :-)
-------
from Operating System Concepts, 10 edition, page 135.
By Silberschatz, Galvin and Gagne?
Post by G G
Ports are finite in size and unidirectional; for two-way
communication, a message is sent to one port, and a response is sent
to a separate reply port. Each port may have multiple senders, but
only one receiver.
Do these ports map to something in Unix? It doesn't look like they
describe anything I've heard about.

If it's TCP or UDP ports, it's a really confusing attempt to explain
them.

/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Rainer Weikusat
2020-08-04 14:13:11 UTC
Post by Jorgen Grahn
Post by G G
what does the author mean? could you please say this
a different way. may an example or two :-)
-------
from Operating System Concepts, 10 edition, page 135.
By Silberschatz, Galvin and Gagne?
Post by G G
Ports are finite in size and unidirectional; for two-way
communication, a message is sent to one port, and a response is sent
to a separate reply port. Each port may have multiple senders, but
only one receiver.
Do these ports map to something in Unix? It doesn't look like they
describe anything I've heard about.
If it's TCP or UDP ports, it's a really confusing attempt to explain
them.
This refers to Mach IPC which is based on an abstraction called "port".
b***@nowhere.co.uk
2020-08-04 15:26:39 UTC
On Tue, 04 Aug 2020 15:13:11 +0100
Post by Rainer Weikusat
Post by Jorgen Grahn
Post by G G
what does the author mean? could you please say this
a different way. may an example or two :-)
-------
from Operating System Concepts, 10 edition, page 135.
By Silberschatz, Galvin and Gagne?
Post by G G
Ports are finite in size and unidirectional; for two-way
communication, a message is sent to one port, and a response is sent
to a separate reply port. Each port may have multiple senders, but
only one receiver.
Do these ports map to something in Unix? It doesn't look like they
describe anything I've heard about.
If it's TCP or UDP ports, it's a really confusing attempt to explain
them.
This refers to Mach IPC which is based on an abstraction called "port".
These confused the hell out of me when I first did a bit of MacOS systems
programming. The word "port" just gets chucked about the documentation
without explaining what it is.
Scott Lurndal
2020-08-04 14:58:54 UTC
Post by Jorgen Grahn
Post by G G
what does the author mean? could you please say this
a different way. may an example or two :-)
-------
from Operating System Concepts, 10 edition, page 135.
By Silberschatz, Galvin and Gagne?
Post by G G
Ports are finite in size and unidirectional; for two-way
communication, a message is sent to one port, and a response is sent
to a separate reply port. Each port may have multiple senders, but
only one receiver.
Do these ports map to something in Unix? It doesn't look like they
describe anything I've heard about.
It does sort of sound like a pipe, albeit there's nothing preventing
a pipe from having multiple receivers, although confusion may ensue
were that to be the case.
G G
2020-08-04 15:38:21 UTC
Post by Jorgen Grahn
Post by G G
what does the author mean? could you please say this
a different way. may an example or two :-)
-------
from Operating System Concepts, 10 edition, page 135.
By Silberschatz, Galvin and Gagne?
yes, sorry i should have included the authors
Post by Jorgen Grahn
Post by G G
Ports are finite in size and unidirectional; for two-way
communication, a message is sent to one port, and a response is sent
to a separate reply port. Each port may have multiple senders, but
only one receiver.
Do these ports map to something in Unix? It doesn't look like they
describe anything I've heard about.
the author is describing Mach Message passing in a section
that includes the following.

"The Mach kernel supports the creation and destruction of
multiple tasks, which are similar to processes but have
multiple threads of control and fewer associated resources.
Most communication in Mach — including all inter-task
communication—is carried out by messages. Messages are sent
to, and received from, mailboxes, which are called ports in
Mach. Ports are finite in size and unidirectional; for two-way
communication, a message is sent to one port, and a response
is sent to a separate reply port. Each port may have multiple
senders, but only one receiver. Mach uses ports to represent
resources such as tasks, threads, memory, and processors, while
message passing provides an object-oriented approach for
interacting with these system resources and services. Message
passing may occur between any two ports on the same host or on
separate hosts on a distributed system." [1]

[1] Operating System Concepts, 10 Edition,
Silberschatz, Galvin, Gagne
p. 135

i hope it's a good textbook, all my textbooks are old.

thanks Jorgen
Joe Pfeiffer
2020-08-04 16:24:14 UTC
Post by G G
what does the author mean? could you please say this
a different way. may an example or two :-)
from Operating System Concepts, 10 edition, page 135.
Wow, this is terse! Silberschatz et al. normally have much longer
and more readable descriptions of stuff (that's why it's one of the
classic OS textbooks). Anyway,
Post by G G
Ports are finite in size
When you are using ports to communicate from one process to another, you
can't just keep shoving data into the port forever. Eventually the
other process has to read the data, or the sending process will block
(if that happens, then once the receiver does read some bytes the sender
can proceed).
Post by G G
and unidirectional;
A port has a sender (well, see below) and a receiver. The receiver
can't transmit using the same port it reads from.
Post by G G
for two-way communication, a message is sent to one port, and a
response is sent to a separate reply port.
If two processes are going to communicate back and forth (rather than
just one being a sender and the other a receiver), you need two ports:
one to get data from A to B, and the other to get data from B to A.
Post by G G
Each port may have multiple senders, but only one receiver.
This is my "see below" from the note on "unidirectional" above -- you
can actually have several processes dumping data into the port, but you
can only have one receiving that data.
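
To make those four points concrete, here is a minimal sketch in Python. It is only a model, not Mach's API: a "port" is played by a bounded queue.Queue, the names request_port, reply_port and serve_one are made up for the illustration, and nothing in queue.Queue actually enforces the single-receiver rule.

```python
import queue
import threading

def serve_one(request_port):
    """Receiver: take one (text, reply_port) message and answer on reply_port."""
    text, reply_port = request_port.get()
    reply_port.put(text.upper())

# A "port" modelled as a bounded FIFO: finite capacity, one receiving thread.
request_port = queue.Queue(maxsize=2)
reply_port = queue.Queue(maxsize=1)

server = threading.Thread(target=serve_one, args=(request_port,))
server.start()

# The sender names its own reply port inside the message: the request travels
# one way on request_port, the answer comes back on the separate reply_port.
request_port.put(("hello", reply_port))
answer = reply_port.get(timeout=5)
server.join()

# "Finite in size": once the port holds maxsize messages, a non-blocking
# send fails instead of letting the queue grow.
full_port = queue.Queue(maxsize=1)
full_port.put("first")
try:
    full_port.put_nowait("second")
    overflowed = False
except queue.Full:
    overflowed = True
```

Several threads could call request_port.put() at once (multiple senders), while only the server thread ever calls get() on it.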

Ports are a very common abstraction for interprocess communication, but
there are subtle (and not-so-subtle) differences between ports as used
by different OSes, and even different types of ports on a single OS.

Hope this helps
Kaz Kylheku
2020-08-04 17:55:20 UTC
Post by G G
what does the author mean? could you please say this
a different way. may an example or two :-)
-------
from Operating System Concepts, 10 edition, page 135.
Ports are finite in size and unidirectional; for two-way
"Finite in size" probably means capacity. They hold a fixed
number of messages or bytes or whatever. If an operation is attempted
to send on a port that is full, it has to be handled somehow, like
by blocking the sender until room is available.
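
A rough way to see the capacity limit on a Unix pipe, sketched in Python. The assumptions are mine: the write end is put in non-blocking mode so that a full pipe reports an error instead of blocking, and the chunk size (select.PIPE_BUF) is an arbitrary choice. The number you get depends on the system.

```python
import os
import select

def pipe_capacity():
    """Fill a fresh pipe with non-blocking writes; return the bytes it accepted."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    chunk = b"x" * select.PIPE_BUF
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # EAGAIN: the pipe is full. A blocking writer would sleep here
                # until a reader made room.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)

capacity = pipe_capacity()
```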

A port being unidirectional means that it has a single FIFO. Messages
are placed on one end and removed from the other. The related signaling
is also asymmetric. The port can signal to a receiver that a message is
available, and to a sender that room is available.

A unidirectional port *can* be used bi-directionally, but only in some
limited or inefficient ways. If the right kind of signaling is
available, two communicating peers can use a unidirectional port
in half-duplex mode, by taking turns.

In a system design, a port object may actually be accessed through proxy
objects called handles. A pair of such handles is created for a port,
and the handles can have separate "sender" and "receiver"
configurations. Only receive operations are allowed on the receive
handle, and only send operations on the send handle. These handles then
look like the "ends" of a one-way pipe. A sender process (or one or more
senders) holds the sending handle. The receiver holds the receiving
handle.

In Unix, an example of a unidirectional port with two handles that have
a dedicated I/O direction is the pipe. A pipe is created using the
pipe() library function, which produces a pair of file descriptors: one
for reading and one for writing. They are connected to opposite ends of
the pipe, so to speak. A Unix pipe is also finite in size: if a writer
tries to place more than a certain amount of bytes into a pipe, the
operation will block. Unix pipes can also be specified in the
filesystem as special directory entries called FIFO objects. Readers and
writers rendezvous by opening the FIFO for reading or writing; the
effect is the same as obtaining a pipe using the pipe() function.
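
The FIFO rendezvous can be sketched like this in Python. The file name demo.fifo and the helper fifo_round_trip are hypothetical, and a thread stands in for what would normally be a second process.

```python
import os
import tempfile
import threading

def fifo_round_trip(payload):
    """Send payload through a named FIFO and return what the reader saw."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fifo")  # hypothetical name
        os.mkfifo(path)

        def writer():
            # Opening for writing blocks until a reader has opened the FIFO.
            with open(path, "wb") as w:
                w.write(payload)

        t = threading.Thread(target=writer)
        t.start()
        # Opening for reading blocks until a writer has opened the FIFO.
        with open(path, "rb") as r:
            data = r.read()  # returns at EOF, once the writer closes its end
        t.join()
        return data

received = fifo_round_trip(b"through the fifo")
```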

Unix pipes carry bytes, not messages. An application that wants to
send discrete messages can do one of two things: (1) choose a message
representation such that all messages are one byte wide or (2) implement
message framing in the byte stream (for instance, length followed by
payload).
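
Option (2) might look like the following Python sketch. The 4-byte big-endian length header and the helper names send_msg, read_exact and recv_msg are my own choices for the illustration; any fixed-size header would do.

```python
import os
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length (arbitrary choice)

def send_msg(fd, payload):
    """Write length header plus payload, looping in case of short writes."""
    view = memoryview(HEADER.pack(len(payload)) + payload)
    while view:
        written = os.write(fd, view)
        view = view[written:]

def read_exact(fd, n):
    """Keep reading until exactly n bytes have arrived."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("pipe closed in mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(fd):
    """Read one framed message: header first, then that many payload bytes."""
    (length,) = HEADER.unpack(read_exact(fd, HEADER.size))
    return read_exact(fd, length)

read_fd, write_fd = os.pipe()
send_msg(write_fd, b"first")
send_msg(write_fd, b"second message")
messages = [recv_msg(read_fd), recv_msg(read_fd)]
os.close(read_fd)
os.close(write_fd)
```

The receiver never assumes that one read() returns one message; read_exact() is what restores the boundaries.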

A bi-directional port object usually has two handles attached to it
which have "crossed circuit wiring". Let's call them handle A and B.
Both A and B are objects of the same kind: they can perform sends and
receives. The behavior is that material sent on A is received on B, and
vice versa. The underlying port has a pair of unidirectional queues,
and each of the two handles knows which one it is using for sending and
which for receiving.

In Unix, an example of this is the connected socket pair. A socket pair
can be created with the socketpair function, which resembles pipe() in
that it produces a pair of file descriptors. Whatever is written on one
can be read from the other and vice versa. The socket pair supports
message delimiting, because the socketpair function has a type argument
which allows the caller to request a datagram socket (SOCK_DGRAM)
instead of a byte stream socket (SOCK_STREAM).
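
A small Python sketch of such a datagram socket pair, assuming a system where AF_UNIX SOCK_DGRAM pairs are available. The variable names are mine.

```python
import socket

# Two connected handles of the same kind: each can both send and receive.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

a.send(b"ping")
a.send(b"ping again")

# Each recv() returns one whole datagram, even though the buffer is larger.
first = b.recv(1024)
second = b.recv(1024)

# The same pair carries traffic in the other direction too.
b.send(b"pong")
reply = a.recv(1024)

a.close()
b.close()
```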
b***@nowhere.co.uk
2020-08-05 08:54:05 UTC
On Tue, 4 Aug 2020 17:55:20 +0000 (UTC)
Post by Kaz Kylheku
In Unix, an example of a unidirectional port with two handles that have
a dedicated I/O direction is the pipe. A pipe is created using the
pipe() library function, which produces a pair of file descriptors: one
for reading and one for writing. They are connected to opposite ends of
the pipe, so to speak. A Unix pipe is also finite in size: if a writer
tries to place more than a certain amount of bytes into a pipe, the
operation will block. Unix pipes can also be specified in the
Not only that - pipes (on Linux at least) don't guarantee to send even small
messages in one go. I found this out the hard way a few years back when
using pipes between processes and getting occasional obscure errors. It
turned out that despite the file descriptor mask bit being set in select() the
pipe did not contain the entire message at that point and so only part was
being read by the receiver. There is really no excuse for this in internal
kernel systems.
Jorgen Grahn
2020-08-05 11:29:43 UTC
Post by b***@nowhere.co.uk
On Tue, 4 Aug 2020 17:55:20 +0000 (UTC)
Post by Kaz Kylheku
In Unix, an example of a unidirectional port with two handles that have
a dedicated I/O direction is the pipe. A pipe is created using the
pipe() library function, which produces a pair of file descriptors: one
for reading and one for writing. They are connected to opposite ends of
the pipe, so to speak. A Unix pipe is also finite in size: if a writer
tries to place more than a certain amount of bytes into a pipe, the
operation will block. Unix pipes can also be specified in the
Not only that - pipes (on linux at least) don't guarantee to send even small
messages in one go. I found this out the hard way a few years back when
using pipes between processes and getting occasional obscure errors. It
turned out that despite the file descriptor mask bit being set in select() the
pipe did not contain the entire message at that point and so only part was
being read by the receiver. There is really no excuse for this in internal
kernel systems.
A pipe is a stream of bytes like e.g. a TCP socket -- you can't safely
define your stream of application-level messages on top of that, and
expect to be notified by select() about those.

Although given this text in Linux pipe(7)

POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must
be atomic: the output data is written to the pipe as a contiguous
sequence. Writes of more than PIPE_BUF bytes may be nonatomic:
the kernel may interleave the data with data written by other
processes. POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)

I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
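
What the quoted paragraph promises is about writers not interleaving, which can be checked with a sketch like this one in Python. The record size, the tags and the use of threads as stand-ins for separate writer processes are assumptions made for the illustration.

```python
import os
import select
import threading

RECORD = 64        # well under PIPE_BUF, so each write() should be atomic
PER_WRITER = 50

def writer(fd, tag):
    for _ in range(PER_WRITER):
        os.write(fd, tag * RECORD)  # one write() call per record

read_fd, write_fd = os.pipe()
tags = [b"A", b"B", b"C", b"D"]
threads = [threading.Thread(target=writer, args=(write_fd, tag)) for tag in tags]
for th in threads:
    th.start()

# Read the whole stream back; how many read() calls this takes is irrelevant.
expected = RECORD * PER_WRITER * len(tags)
data = b""
while len(data) < expected:
    data += os.read(read_fd, expected - len(data))
for th in threads:
    th.join()
os.close(read_fd)
os.close(write_fd)

# Cut the stream into RECORD-sized pieces: if no write was interleaved with
# another, every piece consists of a single repeated tag byte.
records = [data[i:i + RECORD] for i in range(0, expected, RECORD)]
intact = all(len(set(rec)) == 1 for rec in records)
```

Note that this says nothing about how many bytes a single read() hands back.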

/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
b***@nowhere.co.uk
2020-08-05 15:43:52 UTC
On 5 Aug 2020 11:29:43 GMT
Post by Jorgen Grahn
Post by b***@nowhere.co.uk
On Tue, 4 Aug 2020 17:55:20 +0000 (UTC)
Post by Kaz Kylheku
In Unix, an example of a unidirectional port with two handles that have
a dedicated I/O direction is the pipe. A pipe is created using the
pipe() library function, which produces a pair of file descriptors: one
for reading and one for writing. They are connected to opposite ends of
the pipe, so to speak. A Unix pipe is also finite in size: if a writer
tries to place more than a certain amount of bytes into a pipe, the
operation will block. Unix pipes can also be specified in the
Not only that - pipes (on linux at least) don't guarantee to send even small
messages in one go. I found this out the hard way a few years back when
using pipes between processes and getting occasional obscure errors. It
turned out that despite the file descriptor mask bit being set in select() the
pipe did not contain the entire message at that point and so only part was
being read by the receiver. There is really no excuse for this in internal
kernel systems.
A pipe is a stream of bytes like e.g. a TCP socket -- you can't safely
define your stream of application-level messages on top of that, and
expect to be notified by select() about those.
Although given this text in Linux pipe(7)
POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must
be atomic: the output data is written to the pipe as a contiguous
sequence. Writes of more than PIPE_BUF bytes may be nonatomic:
the kernel may interleave the data with data written by other
processes. POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
Not much chance of that. The messages I was sending were only a few 10s of
bytes each and it couldn't even keep them in one piece. The only solution was
to add a length field to each message and the receiver had to wait until the
whole message had been received. It was a right PITA.
James K. Lowden
2020-08-05 23:52:07 UTC
On Wed, 5 Aug 2020 15:43:52 +0000 (UTC)
Post by b***@nowhere.co.uk
Post by Jorgen Grahn
Although given this text in Linux pipe(7)
POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must
be atomic: the output data is written to the pipe as a contiguous
sequence. Writes of more than PIPE_BUF bytes may be nonatomic:
the kernel may interleave the data with data written by other
processes. POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
Not much chance of that. The messages I was sending were only a few
10s of bytes each and it couldn't even keep them in one piece. The
only solution was to add a length field to each message and the
receiver had to wait until the whole message had been received. It
was a right PITA.
I have to say I'm surprised. I thought "atomic" meant the call doesn't
return until the message is delivered to the pipe, and that read(2)
would wait for write(2) to finish.

That doesn't mean there are message boundaries. If you write 10 bytes
twice, and read 12, I'd expect you'd get 12 bytes. Message boundaries
are your problem in a streaming protocol.
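
A minimal Python sketch of that case, with both writes completed before the first read:

```python
import os

read_fd, write_fd = os.pipe()
os.write(write_fd, b"0123456789")   # first 10-byte "message"
os.write(write_fd, b"abcdefghij")   # second 10-byte "message"

# The reader asks for 12 bytes and gets bytes from both writes: the pipe
# keeps no record of where one write ended and the next began.
first = os.read(read_fd, 12)
rest = os.read(read_fd, 12)

os.close(read_fd)
os.close(write_fd)
```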

But, according to your description, "atomic" means even less than I
thought. It only means that two writes won't interfere with each
other: two "simultaneous" atomic writes will appear in the pipe as if
they happened sequentially (because, I suppose, they *did* happen
sequentially). It says nothing about read.

I switched to POSIX message queues when local, and SOCK_SEQPACKET when
not. I *think* that means clear sailing.

--jkl
b***@nowhere.co.uk
2020-08-06 07:43:03 UTC
On Wed, 5 Aug 2020 19:52:07 -0400
Post by James K. Lowden
On Wed, 5 Aug 2020 15:43:52 +0000 (UTC)
Post by b***@nowhere.co.uk
Not much chance of that. The messages I was sending were only a few
10s of bytes each and it couldn't even keep them in one piece. The
only solution was to add a length field to each message and the
receiver had to wait until the whole message had been received. It
was a right PITA.
I have to say I'm surprised. I thought "atomic" meant the call doesn't
return until the message is delivered to the pipe, and that read(2)
would wait for write(2) to finish.
It only happened very occasionally which is why I didn't pick up on it
straight away. But it shouldn't happen at all IMO.
Joe Pfeiffer
2020-08-06 17:19:45 UTC
Post by b***@nowhere.co.uk
On Wed, 5 Aug 2020 19:52:07 -0400
Post by James K. Lowden
On Wed, 5 Aug 2020 15:43:52 +0000 (UTC)
Post by b***@nowhere.co.uk
Not much chance of that. The messages I was sending were only a few
10s of bytes each and it couldn't even keep them in one piece. The
only solution was to add a length field to each message and the
receiver had to wait until the whole message had been received. It
was a right PITA.
I have to say I'm surprised. I thought "atomic" meant the call doesn't
return until the message is delivered to the pipe, and that read(2)
would wait for write(2) to finish.
It only happened very occasionally which is why I didn't pick up on it
straight away. But it shouldn't happen at all IMO.
You're working with bytestreams, not datagrams. Most of the time, for
small messages, you can forget that -- but every so often it will step
up and bite (byte?) you.
Rainer Weikusat
2020-08-06 15:34:24 UTC
Post by James K. Lowden
On Wed, 5 Aug 2020 15:43:52 +0000 (UTC)
Post by b***@nowhere.co.uk
Post by Jorgen Grahn
Although given this text in Linux pipe(7)
POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must
be atomic: the output data is written to the pipe as a contiguous
sequence. Writes of more than PIPE_BUF bytes may be nonatomic:
the kernel may interleave the data with data written by other
processes. POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
Not much chance of that. The messages I was sending were only a few
10s of bytes each and it couldn't even keep them in one piece. The
only solution was to add a length field to each message and the
receiver had to wait until the whole message had been received. It
was a right PITA.
I have to say I'm surprised. I thought "atomic" meant the call doesn't
return until the message is delivered to the pipe, and that read(2)
would wait for write(2) to finish.
At least for Linux, a read started before an earlier write completed
will wait until the write is either complete or the pipe full (all
serialized with respect to each other based on a mutex associated with the
inode).

[...]
Post by James K. Lowden
I switched to Posix message queue when local, and SOCK_SEQPACKET when
not. I *think* that means clear sailing.
Locally, you could also use AF_UNIX SOCK_DGRAM sockets (if getting an
EOF on writer disconnect/close isn't necessary).
Kaz Kylheku
2020-08-06 09:51:37 UTC
Post by Jorgen Grahn
Post by b***@nowhere.co.uk
On Tue, 4 Aug 2020 17:55:20 +0000 (UTC)
Post by Kaz Kylheku
In Unix, an example of a unidirectional port with two handles that have
a dedicated I/O direction is the pipe. A pipe is created using the
pipe() library function, which produces a pair of file descriptors: one
for reading and one for writing. They are connected to opposite ends of
the pipe, so to speak. A Unix pipe is also finite in size: if a writer
tries to place more than a certain amount of bytes into a pipe, the
operation will block. Unix pipes can also be specified in the
Not only that - pipes (on linux at least) don't guarantee to send even small
messages in one go. I found this out the hard way a few years back when
using pipes between processes and getting occasional obscure errors. It
turned out that despite the file descriptor mask bit being set in select() the
pipe did not contain the entire message at that point and so only part was
being read by the receiver. There is really no excuse for this in internal
kernel systems.
A pipe is a stream of bytes like e.g. a TCP socket -- you can't safely
define your stream of application-level messages on top of that, and
expect to be notified by select() about those.
Although given this text in Linux pipe(7)
POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must
be atomic: the output data is written to the pipe as a contiguous
sequence. Writes of more than PIPE_BUF bytes may be nonatomic:
the kernel may interleave the data with data written by other
processes. POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.

Basically, a way is needed to poll for the pipe having at least 2000
bytes of room, not just any positive amount of room.

If you could poll for that condition, then you would know that
a 2000 byte write will subsequently not fragment.

One way to poll for that condition would be to get a reply from the
reader (using a different pipe) to confirm it got everything.
The pipe is then known to be empty. The pipe is then essentially
being used as a message box.
Geoff Clare
2020-08-06 12:34:24 UTC
Post by Kaz Kylheku
Post by Jorgen Grahn
POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.
Not if PIPE_BUF is 4096. POSIX says that when O_NONBLOCK is set:

A write request for {PIPE_BUF} or fewer bytes shall have the
following effect: if there is sufficient space available in the
pipe, write() shall transfer all the data and return the number of
bytes requested. Otherwise, write() shall transfer no data and
return -1 with errno set to [EAGAIN].
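
The all-or-nothing rule can be observed with a sketch like the following in Python. The 500-byte chunk is my choice (it is no larger than 512, the smallest PIPE_BUF POSIX allows), and fill_with_small_writes is a made-up helper name.

```python
import os

def fill_with_small_writes(size=500):
    """Write size-byte chunks to a non-blocking pipe until EAGAIN.

    Returns the byte counts reported by the writes that succeeded.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    chunk = b"x" * size  # <= PIPE_BUF on any POSIX system
    results = []
    try:
        while True:
            try:
                results.append(os.write(write_fd, chunk))
            except BlockingIOError:
                # EAGAIN: not enough room for the whole chunk, nothing written.
                return results
    finally:
        os.close(read_fd)
        os.close(write_fd)

results = fill_with_small_writes()
```

On a conforming system every entry in results should equal the chunk size: the last write that did not fit is refused outright rather than truncated.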
--
Geoff Clare <***@gclare.org.uk>
Kaz Kylheku
2020-08-06 16:31:40 UTC
Post by Geoff Clare
Post by Kaz Kylheku
Post by Jorgen Grahn
POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.
A write request for {PIPE_BUF} or fewer bytes shall have the
following effect: if there is sufficient space available in the
pipe, write() shall transfer all the data and return the number of
bytes requested. Otherwise, write() shall transfer no data and
return -1 with errno set to [EAGAIN].
Does that mean that the descriptor also won't poll writable unless
there is PIPE_BUF space?

Or is it the case that you can get EAGAIN in spite of positive poll?
Scott Lurndal
2020-08-06 18:40:41 UTC
Post by Kaz Kylheku
Post by Geoff Clare
Post by Kaz Kylheku
Post by Jorgen Grahn
POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.
A write request for {PIPE_BUF} or fewer bytes shall have the
following effect: if there is sufficient space available in the
pipe, write() shall transfer all the data and return the number of
bytes requested. Otherwise, write() shall transfer no data and
return -1 with errno set to [EAGAIN].
Does that mean that the descriptor also won't poll writable unless
there is PIPE_BUF space?
Or is it the case that you can get EAGAIN in spite of positive poll?
The latter. By the time poll returns, the condition could no longer
be valid.
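
In practice that means treating readiness as a hint and being prepared for EAGAIN anyway. A minimal Python sketch, using select() rather than poll(); the helper names write_record and drain, the record size and the timeout are assumptions for the illustration, and a thread plays the reader.

```python
import os
import select
import threading

def write_record(fd, record, timeout=5.0):
    """Write one record (<= PIPE_BUF bytes) to a non-blocking pipe."""
    while True:
        _, writable, _ = select.select([], [fd], [], timeout)
        if not writable:
            raise TimeoutError("pipe never became writable")
        try:
            return os.write(fd, record)
        except BlockingIOError:
            # Reported writable, yet write() said EAGAIN: wait and try again.
            continue

def drain(fd, total, out):
    """Reader side: consume total bytes and record how many were seen."""
    got = 0
    while got < total:
        got += len(os.read(fd, 65536))
    out.append(got)

read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)
record = b"r" * 500
count = 1000
out = []
reader = threading.Thread(target=drain, args=(read_fd, len(record) * count, out))
reader.start()
sizes = [write_record(write_fd, record) for _ in range(count)]
reader.join()
os.close(read_fd)
os.close(write_fd)
```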
Kaz Kylheku
2020-08-06 19:04:09 UTC
Post by Scott Lurndal
Post by Kaz Kylheku
Post by Geoff Clare
Post by Kaz Kylheku
Post by Jorgen Grahn
POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.
A write request for {PIPE_BUF} or fewer bytes shall have the
following effect: if there is sufficient space available in the
pipe, write() shall transfer all the data and return the number of
bytes requested. Otherwise, write() shall transfer no data and
return -1 with errno set to [EAGAIN].
Does that mean that the descriptor also won't poll writable unless
there is PIPE_BUF space?
Or is it the case that you can get EAGAIN in spite of positive poll?
The latter. By the type poll returns, the condition could no longer
be valid.
That doesn't seem like a valid justification. If the polling
process/thread is the only writer, and the pipe has not changed state in
any manner between the successful poll and the write attempt,
it cannot be the case that some condition no longer holds.

(Even if you held some absolute system-wide mutex that prevents all
other system activity between the two, including interrupts and
execution on other processors, you would still get the EAGAIN, right?)

Furthermore, if nobody else is writing (so as to pre-empt the use of
space) and there is read activity then between the time of a successful
write poll and the write operation, there can be only *more* room
available, not less.
Scott Lurndal
2020-08-06 20:18:54 UTC
Post by Kaz Kylheku
Post by Kaz Kylheku
Post by Geoff Clare
Post by Kaz Kylheku
Post by Jorgen Grahn
POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.
A write request for {PIPE_BUF} or fewer bytes shall have the
following effect: if there is sufficient space available in the
pipe, write() shall transfer all the data and return the number of
bytes requested. Otherwise, write() shall transfer no data and
return -1 with errno set to [EAGAIN].
Does that mean that the descriptor also won't poll writable unless
there is PIPE_BUF space?
Or is it the case that you can get EAGAIN in spite of positive poll?
The latter. By the time[Ed.] poll returns, the condition could no longer
be valid.
That doesn't seem like a valid justification. If the polling
proces/thread is the only writer, and the pipe has not changed state in
any manner between the successful poll and the write attempt,
it cannot be the case that some condition no longer holds.
That's a big if. It's not uncommon to have multiple writers.

A possible implementation may require allocation of a write
buffer on the write(2) system call, which may be temporarily unavailable for
an arbitrary implementation-defined reason (memory pressure, for example),
in which case write(2) would return EAGAIN.
Kaz Kylheku
2020-08-07 06:53:33 UTC
Post by Scott Lurndal
Post by Kaz Kylheku
Post by Kaz Kylheku
Post by Geoff Clare
Post by Kaz Kylheku
Post by Jorgen Grahn
POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.
A write request for {PIPE_BUF} or fewer bytes shall have the
following effect: if there is sufficient space available in the
pipe, write() shall transfer all the data and return the number of
bytes requested. Otherwise, write() shall transfer no data and
return -1 with errno set to [EAGAIN].
Does that mean that the descriptor also won't poll writable unless
there is PIPE_BUF space?
Or is it the case that you can get EAGAIN in spite of positive poll?
The latter. By the time[Ed.] poll returns, the condition could no longer
be valid.
That doesn't seem like a valid justification. If the polling
proces/thread is the only writer, and the pipe has not changed state in
any manner between the successful poll and the write attempt,
it cannot be the case that some condition no longer holds.
That's a big if. It's not uncommon to have multiple writers.
It's not uncommon to have exactly one writer, either.

foo | grep bar
Post by Scott Lurndal
A possible implementation may requiring allocation of a write
buffer on the write(1) system call, which may be temporarily unavailable for
arbitrary implementation-defined reason (memory pressure, for example),
in which case write(1) would return EAGAIN.
If a prior poll said the descriptor is writable, and no other writer
stole the space, then the descriptor has space.

A typical descriptor like a serial TTY or socket will then take some of
the bytes into that space, and not return EAGAIN.

The pipe-related requirement that a write of PIPE_BUF
bytes or fewer must either be written in full or fail with EAGAIN
breaks those semantics.
--
TXR Programming Language: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1
Scott Lurndal
2020-08-10 15:59:00 UTC
Post by Kaz Kylheku
Post by Scott Lurndal
Post by Kaz Kylheku
Post by Kaz Kylheku
Post by Geoff Clare
Post by Kaz Kylheku
Post by Jorgen Grahn
POSIX.1-2001 requires PIPE_BUF to be at least 512
bytes. (On Linux, PIPE_BUF is 4096 bytes.)
I would expect 2000 bytes written in one go to a pipe which doesn't
block, to show up in one go, too.
It can block, though. If you use nonblocking writes, it could take
200 bytes, leaving the writer with 1800 unsent.
A write request for {PIPE_BUF} or fewer bytes shall have the
following effect: if there is sufficient space available in the
pipe, write() shall transfer all the data and return the number of
bytes requested. Otherwise, write() shall transfer no data and
return -1 with errno set to [EAGAIN].
Does that mean that the descriptor also won't poll writable unless
there is PIPE_BUF space?
Or is it the case that you can get EAGAIN in spite of positive poll?
The latter. By the time[Ed.] poll returns, the condition could no longer
be valid.
That doesn't seem like a valid justification. If the polling
proces/thread is the only writer, and the pipe has not changed state in
any manner between the successful poll and the write attempt,
it cannot be the case that some condition no longer holds.
That's a big if. It's not uncommon to have multiple writers.
It's not uncommon to have exactly one writer, either.
foo | grep bar
And if foo is multithreaded, foo may have several threads writing to the
pipe (e.g. via fprintf to stdout).
Post by Kaz Kylheku
Post by Scott Lurndal
A possible implementation may require allocation of a write
buffer on the write(2) system call, which may be temporarily unavailable for
an arbitrary implementation-defined reason (memory pressure, for example),
in which case write(2) would return EAGAIN.
If a prior poll said the descriptor is writable, and no other writer
stole the space, then the descriptor has space.
Poll says that the pipe buffer contains less than PIPE_BUF bytes.

EAGAIN says the write couldn't start due to a resource issue. The resource
could be something _other_ than the pipe buffer. If, for example, the pipe
is implemented using a set of stream buffers (SVR4), allocation of a
stream buffer to the pipe may fail. Or some other kernel-internal
reason may cause write to return EAGAIN. Just because poll indicates
that the pipe buffer isn't full doesn't mean a subsequent write to the
pipe won't receive EAGAIN.
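
One way a writer can cope with EAGAIN after a positive poll is simply to go back to polling. A minimal Python sketch, assuming a non-blocking descriptor on a POSIX system; write_all is a hypothetical helper name, not an existing API.

```python
import os
import select


def write_all(wfd, data):
    """Write all of `data` to a non-blocking descriptor.

    poll() is treated only as a hint: if write() still fails with
    EAGAIN (BlockingIOError in Python), the loop polls again instead
    of treating the failure as an error.
    """
    poller = select.poll()
    poller.register(wfd, select.POLLOUT)
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        poller.poll()  # wait until the descriptor is reported writable
        try:
            sent += os.write(wfd, view[sent:])
        except BlockingIOError:
            # EAGAIN despite a positive poll: retry later.
            # Real code might back off here to avoid spinning.
            continue
    return sent
```

Other errors, such as EPIPE when the reader has gone away, are deliberately left to propagate as exceptions.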

Joe Pfeiffer
2020-08-05 18:08:56 UTC
Permalink
Post by b***@nowhere.co.uk
On Tue, 4 Aug 2020 17:55:20 +0000 (UTC)
Post by Kaz Kylheku
In Unix, an example of a unidirectional port with two handles that have
a dedicated I/O direction is the pipe. A pipe is created using the
pipe() library function, which produces a pair of file descriptors: one
for reading and one for writing. They are connected to opposite ends of
the pipe, so to speak. A Unix pipe is also finite in size: if a writer
tries to place more than a certain amount of bytes into a pipe, the
operation will block. Unix pipes can also be specified in the
Not only that - pipes (on linux at least) don't guarantee to send even small
messages in one go. I found this out the hard way a few years back when
using pipes between processes and getting occasional obscure errors. It
turned out that despite the file descriptor mask bit being set in select() the
pipe did not contain the entire message at that point and so only part was
being read by the receiver. There is really no excuse for this in internal
kernel systems.
You pretty much can't depend on any read() call to deliver all your
bytes in one shot. The only safe way to do it is to put them in a
while() loop and increment a counter with the number of bytes read after
every read() (and also do appropriate error checking, of course).

UDP packets are the only ones I can think of off hand that guarantee an
all-or-nothing read().
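
The loop described above might look like this in Python, whose os.read is a direct wrapper over read(). This is a minimal sketch; read_exactly is a hypothetical helper name, and it assumes the caller already knows how many bytes to expect.

```python
import os


def read_exactly(fd, count):
    """Read exactly `count` bytes from `fd`, looping over short reads."""
    chunks = []
    received = 0  # running count of bytes read so far
    while received < count:
        chunk = os.read(fd, count - received)
        if not chunk:  # EOF before the full message arrived
            raise EOFError(f"expected {count} bytes, got {received}")
        chunks.append(chunk)
        received += len(chunk)
    return b"".join(chunks)
```

A short read is not an error here; only end-of-file before `count` bytes have arrived is reported.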
Jorgen Grahn
2020-08-06 07:53:50 UTC
Permalink
On Wed, 2020-08-05, Joe Pfeiffer wrote:
...
Post by Joe Pfeiffer
You pretty much can't depend on any read() call to deliver all your
bytes in one shot. The only safe way to do it is to put them in a
while() loop and increment a counter with the number of bytes read after
every read() (and also do appropriate error checking, of course).
UDP packets are the only ones I can think of off hand that guarantee an
all-or-nothing read().
The SOCK_DGRAM Unix domain sockets too, and I guess SCTP if you want
to go that route.

/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
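
For comparison, a minimal Python sketch of the SOCK_DGRAM Unix domain socket case, assuming a platform where socket.socketpair supports AF_UNIX datagram sockets (Linux and the BSDs do). The function name exchange is made up for illustration. Each recv() returns one whole datagram, never a fragment and never two messages joined together.

```python
import socket


def exchange(messages, bufsize=65536):
    """Send each message as one datagram and receive them back one by one.

    Keep the list short: everything is sent before anything is received,
    so the messages must fit in the socket's receive queue.
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with sender, receiver:
        for message in messages:
            sender.send(message)
        # One recv() per datagram; message boundaries are preserved.
        return [receiver.recv(bufsize) for _ in messages]


if __name__ == "__main__":
    print(exchange([b"one", b"second message", b"3"]))
```

One caveat: if bufsize is smaller than a datagram, the excess bytes of that datagram are discarded rather than left for the next recv().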
Kaz Kylheku
2020-08-06 09:44:16 UTC
Permalink
Post by b***@nowhere.co.uk
On Tue, 4 Aug 2020 17:55:20 +0000 (UTC)
Post by Kaz Kylheku
In Unix, an example of a unidirectional port with two handles that have
a dedicated I/O direction is the pipe. A pipe is created using the
pipe() library function, which produces a pair of file descriptors: one
for reading and one for writing. They are connected to opposite ends of
the pipe, so to speak. A Unix pipe is also finite in size: if a writer
tries to place more than a certain amount of bytes into a pipe, the
operation will block. Unix pipes can also be specified in the
Not only that - pipes (on linux at least) don't guarantee to send even small
messages in one go.
That's really a flip-side of the same thing. If a pipe has 3 bytes
free out of 4096 (or whatever) then 3 bytes of your next message might
fit, and those 3 bytes are read as part of a 4096 byte read.
Post by b***@nowhere.co.uk
I found this out the hard way a few years back when
using pipes between processes and getting occasional obscure errors. It
turned out that despite the file descriptor mask bit being set in select() the
pipe did not contain the entire message at that point and so only part was
being read by the receiver. There is really no excuse for this in internal
kernel systems.
How would you fix it? If we accept the nature of the pipe, the only
alternative is to block the writer until there is enough room in the
pipe that the message will all fit, and reject the write if the message
is too large for the pipe. That would cause problems.

The alternative is to make the pipe a FIFO of write requests. The FIFO
can be restricted in length, but the requests may be of any size.

The implementation could refuse to copy, into kernel space, writes that
are above a certain size. So that is to say, small writes would be
copied into the FIFO and then the writer would be allowed to return. But
large writes could force the writer to block until they are fully
read by a reader; i.e. the two processes are forced to rendezvous and
the pipe code can then transfer the data directly from one address space
to the other without a kernel copy. In other words, the FIFO is
polymorphic with two kinds of entries: simple buffer copy, or address
space reference with blocked writer. That would thwart memory DoS
attacks, yet still allow one process to do a two-gigabyte write such that
another could read all of it in a single read after a positive select.

There are obvious problems integrating O_NONBLOCK semantics into this
approach, though, which seem damning.
--
TXR Programming Language: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1
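
The "FIFO of write requests" idea above can be modelled in user space. What follows is only a rough Python sketch with threads standing in for processes: RequestFifo and SMALL_WRITE_LIMIT are invented names, the size threshold is arbitrary, and it makes no attempt at the O_NONBLOCK problem mentioned above.

```python
import queue
import threading

SMALL_WRITE_LIMIT = 4096  # arbitrary threshold, chosen for illustration


class RequestFifo:
    """FIFO of whole write requests: each read returns one complete write."""

    def __init__(self, max_requests=16):
        # The FIFO is bounded in number of requests, not in bytes.
        self._requests = queue.Queue(maxsize=max_requests)

    def write(self, data):
        if len(data) <= SMALL_WRITE_LIMIT:
            # Small write: store a snapshot and return immediately.
            self._requests.put((bytes(data), None))
            return len(data)
        # Large write: queue only a reference and block until a reader
        # has taken the data (the "rendezvous" case).
        consumed = threading.Event()
        self._requests.put((memoryview(data), consumed))
        consumed.wait()
        return len(data)

    def read(self):
        payload, consumed = self._requests.get()
        data = bytes(payload)  # the single copy, made on the reader's side
        if consumed is not None:
            consumed.set()     # release the blocked writer
        return data
```

In this model a reader always gets one complete message per read(), however large, and a large writer cannot return before its data has been consumed.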
b***@nowhere.co.uk
2020-08-06 15:02:43 UTC
Permalink
On Thu, 6 Aug 2020 09:44:16 +0000 (UTC)
Post by Kaz Kylheku
Post by b***@nowhere.co.uk
I found this out the hard way a few years back when
using pipes between processes and getting occasional obscure errors. It
turned out that despite the file descriptor mask bit being set in select() the
pipe did not contain the entire message at that point and so only part was
being read by the receiver. There is really no excuse for this in internal
kernel systems.
How would you fix it? If we accept the nature of the pipe, the only
alternative is to block the writer until there is enough room in the
pipe that the message will all fit, and reject the write if the message
is too large for the pipe. That would cause problems.
The write() calls were all returning having sent the complete message in one
go. However the kernel appeared to be fragmenting the messages so that
they were not being read by read() in one go in the other process. This has to
be a design fault or even a bug.
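
Whatever the cause, a receiver on a byte-stream pipe can be made independent of how the data happens to be split up by framing each message. A minimal Python sketch, assuming a 4-byte big-endian length prefix; that framing, and the names send_message and recv_message, are illustrative choices rather than anything mandated by pipes.

```python
import os
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length (an assumed format)


def _read_exactly(fd, count):
    """Loop over short reads until `count` bytes have arrived."""
    data = b""
    while len(data) < count:
        chunk = os.read(fd, count - len(data))
        if not chunk:
            raise EOFError("stream ended inside a message")
        data += chunk
    return data


def send_message(fd, payload):
    """Write the length prefix and payload, looping over short writes."""
    view = memoryview(HEADER.pack(len(payload)) + payload)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def recv_message(fd):
    """Return exactly one message, however many read() calls that takes."""
    (length,) = HEADER.unpack(_read_exactly(fd, HEADER.size))
    return _read_exactly(fd, length)
```

With this in place, select() reporting the descriptor readable only means "some bytes are available"; recv_message keeps reading until the whole message is there.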
Kaz Kylheku
2020-08-04 18:21:06 UTC
Permalink
Post by G G
to a separate reply port. Each port may have multiple senders, but
only one receiver.
This restriction is badly stated.

Without saying so explicitly, it is taking the point of view that a port
is actually a message box belonging to the receiver, and that the
definition of a "receiver" is that it is a service which receives
requests, performs some actions, and produces replies.

Of course, multiple clients use the service.

In reality, even under the above assumptions, such a service can be
performed using multiple processes, which all wait for a message from
the same queue. The sender does not know or care which of those
processes services the message.

It would be better to describe the semantics of a port itself without
tying it to a particular intended scenario like multiple clients sharing
the same service using a send/receive/reply discipline.

If the communication port has no *inherent* addressing mechanism, then
it is restricted to a single logical receiver and sender.

So that is to say if we can do only:

send(port, message)

and

message = receive(port)

without any addressing arguments about who is sending the message
and who is to receive it, then to the port itself, all senders look
the same and all receivers look the same.

What the book is getting at is the message box model: each task in the
system has its own port from which only it receives. This means that
if we do:

send(port, message)

the send operation is *implicitly addressed* by the fact that we really
did this:

port = intended_recipient.message_box

send(port, message)

Now what is missing is that since we expect a reply to our message like
"your request was done", we have to include our address. If the port
doesn't know about that, we have to encode it in the message.

# get the message box of the "intended_recipient" task
service_port = intended_recipient.message_box

# indicate our own message box in the request
message.reply_to_port = THIS_THREAD.message_box

# send the request
send(service_port, message)

# wait for reply on our own message box
reply = receive(THIS_THREAD.message_box)

Then the receiver, the "intended_recipient", can do this:

# receive the request
message = receive(THIS_THREAD.message_box)

# do the request and produce a reply
reply = do_the_operation(message)

# send reply to the port indicated in the request
send(message.reply_to_port, reply)

But this idea that tasks have a message box isn't the only possible use
case for these ports. For instance, a group of tasks could constitute a
service, and then that service entity will have a single message box.

The sender then does:

# get the message box of the foo service:
service_port = foo_service.port

The service has multiple tasks doing this at the same time:

message = receive(foo_service.port)

It's not even the case that the senders have to use their own message
box. The requests could be made using multiple threads on behalf of some
subsystem in which there is a single reply port for all the request
completions. So those threads specify that reply port, instead of their
own private "message box" port.
--
TXR Programming Language: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1
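
The send/receive/reply discipline described above can be tried out with ordinary threads. A minimal Python sketch, using queue.Queue as a stand-in for a port; the Port class, the STOP sentinel and the upper-casing "service" are all invented for illustration and say nothing about how Mach itself implements ports.

```python
import queue
import threading


class Port:
    """A finite, one-way message queue standing in for a port."""

    def __init__(self, capacity=16):
        self._queue = queue.Queue(maxsize=capacity)

    def send(self, message):
        self._queue.put(message)   # blocks while the port is full

    def receive(self):
        return self._queue.get()   # blocks while the port is empty


STOP = object()  # sentinel telling a worker to exit


def worker(service_port):
    """One of several tasks servicing requests from the same port."""
    while True:
        message = service_port.receive()
        if message is STOP:
            return
        reply = message["payload"].upper()      # "do the operation"
        message["reply_to_port"].send(reply)    # reply where the request says


def request(service_port, reply_port, payload):
    """Send a request carrying our reply port, then wait for the reply."""
    service_port.send({"reply_to_port": reply_port, "payload": payload})
    return reply_port.receive()


def run_demo():
    service_port = Port()
    workers = [threading.Thread(target=worker, args=(service_port,))
               for _ in range(2)]
    for thread in workers:
        thread.start()
    my_reply_port = Port()
    replies = [request(service_port, my_reply_port, word)
               for word in ("ping", "status")]
    for _ in workers:
        service_port.send(STOP)
    for thread in workers:
        thread.join()
    return replies
```

Here run_demo() returns ["PING", "STATUS"]. The sender never learns which of the two workers handled a request, and the reply finds its way back only because the request carried its own reply port.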