In the last section we looked at sending a message through a pipe, and everything worked well. However, there was a hidden problem, which becomes apparent if we make the sending program run a little slower, for example by adding a short sleep between sending each letter of the message:
import time
message = "hello to a pipe\n"
with open("my_pipe", "w") as f:
    print "have opened pipe, commencing writing...."
    for c in message:
        f.write(c)
        print "have sent a letter"
        time.sleep(1)
If we run this program and attempt to read from my_pipe in a different virtual
terminal with cat my_pipe, we’ll see that we need to show a bit of patience!
The terminal running the Python process will periodically print reassuring
messages saying it has sent letters, and yet no letters appear on the virtual
terminal running cat my_pipe, at least not until the end of the message is
reached, at which point the whole message suddenly appears.
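The root of this problem is buffering. Rather than being sent straight through
the pipe, the data is held in a buffer by Python’s IO by default, and this
buffer is not flushed until it fills up or the file is closed. Our short
message never fills the buffer, so nothing is sent until the with block ends
and closes the pipe, which is also when the reader sees EOF (end of file).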
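The effect can be reproduced inside a single process. The following is a minimal sketch, not part of the programs in this section: it assumes an anonymous pipe from os.pipe() as a stand-in for my_pipe, and the helper names readable and buffered_write_demo are made up for illustration. It is written so that it should run under both Python 2 and Python 3.

```python
import os
import select


def readable(fd):
    """Return True if fd has data waiting to be read right now."""
    ready, _, _ = select.select([fd], [], [], 0)
    return bool(ready)


def buffered_write_demo():
    # An anonymous pipe stands in for my_pipe so one process holds both ends.
    read_fd, write_fd = os.pipe()
    writer = os.fdopen(write_fd, "w")    # default buffering, as with open("my_pipe", "w")
    writer.write("hello")
    before_flush = readable(read_fd)     # the data is still held in the writer's buffer
    writer.flush()
    after_flush = readable(read_fd)      # flush() has pushed it into the pipe
    data = os.read(read_fd, 100)
    writer.close()
    os.close(read_fd)
    return before_flush, after_flush, data


print(buffered_write_demo())
```

Nothing is readable from the pipe until flush() is called, which mirrors what cat my_pipe experiences with the slow writer.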
Looking at the documentation for Python’s open function, we see that there is
a way to control the size of this buffer, through the buffering argument.
buffering has a special meaning when it is set to 0: no buffering is done at
all. Altering our slow writer to use buffering = 0:
# write_to_pipe_buf_0.py
import time
message = "hello to a pipe\n"
with open("my_pipe", "w", 0) as f:
    print "have opened pipe, commencing writing...."
    for c in message:
        f.write(c)
        print "have sent a letter"
        time.sleep(1)
and running again, while reading again with cat my_pipe in another virtual
terminal, we can see the effect of altering the buffering: each letter is
shown one at a time.
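One caveat if you are following along with Python 3 rather than the Python 2 used here: its open function only accepts buffering = 0 for binary modes and raises a ValueError for text modes. The sketch below, again using an anonymous pipe and a made-up function name as assumptions, opens the write end in "wb" mode, which both versions accept, and checks that a single byte arrives without any flush.

```python
import os
import select


def unbuffered_write_demo():
    read_fd, write_fd = os.pipe()
    # buffering=0 together with a binary mode works in Python 2 and Python 3.
    writer = os.fdopen(write_fd, "wb", 0)
    writer.write(b"h")                   # no flush() call anywhere
    ready, _, _ = select.select([read_fd], [], [], 0)
    data = os.read(read_fd, 100) if ready else b""
    writer.close()
    os.close(read_fd)
    return data


print(unbuffered_write_demo())
```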
The documentation for open also mentions a special value of buffering = 1,
which is line buffered. This does what its name implies, as can be seen by
running this program, which has our message broken into lines, while reading
from the pipe:
# write_to_pipe_buf_1.py
import time
message = "hello\nto a\npipe\n"
with open("my_pipe", "w", 1) as f:
    print "have opened pipe, commencing writing...."
    for c in message:
        f.write(c)
        print "have sent a letter"
        time.sleep(1)
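Line buffering can also be checked without a second terminal. This is a rough sketch under the same assumptions as before (an anonymous pipe instead of my_pipe, and hypothetical helper names drain and line_buffered_demo): a partial line should stay in the buffer, and the newline should release the whole line at once.

```python
import os
import select


def drain(fd):
    """Return whatever bytes are waiting on fd, or b'' if there are none."""
    ready, _, _ = select.select([fd], [], [], 0)
    return os.read(fd, 4096) if ready else b""


def line_buffered_demo():
    read_fd, write_fd = os.pipe()
    writer = os.fdopen(write_fd, "w", 1)  # buffering=1 requests line buffering
    writer.write("hel")
    partial = drain(read_fd)              # no newline yet, so nothing has been sent
    writer.write("lo\n")
    complete = drain(read_fd)             # the newline triggers a flush of the line
    writer.close()
    os.close(read_fd)
    return partial, complete


print(line_buffered_demo())
```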
Other values of buffering change the size of the chunks in which the data is
sent across the pipe. See, for example, what effect a buffering of 4 has on
this program:
# write_to_pipe_buf_4.py
import time
message = "hello to a pipe\n"
with open("my_pipe", "w", 4) as f:
    print "have opened pipe, commencing writing...."
    for c in message:
        f.write(c)
        print "have sent a letter"
        time.sleep(1)
This program sends the message across the pipe in chunks of approximately 4 bytes.
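To see the chunking for yourself, the sketch below writes a message one byte at a time and records what has arrived in the pipe after each write. It makes several assumptions: an anonymous pipe replaces my_pipe, the write end is opened in binary mode so that the same code runs on Python 2 and Python 3, and the names drain and chunked_write_demo are invented for this example. The exact chunk boundaries may differ between Python versions, so treat the output as indicative only.

```python
import os
import select


def drain(fd):
    """Return whatever bytes are waiting on fd, or b'' if there are none."""
    ready, _, _ = select.select([fd], [], [], 0)
    return os.read(fd, 4096) if ready else b""


def chunked_write_demo(message, buffering):
    read_fd, write_fd = os.pipe()
    writer = os.fdopen(write_fd, "wb", buffering)
    chunks = []
    for i in range(len(message)):
        writer.write(message[i:i + 1])    # one byte per write, like the slow writer
        chunk = drain(read_fd)
        if chunk:
            chunks.append(chunk)          # something was flushed by this write
    writer.close()                        # closing flushes whatever is left over
    chunk = drain(read_fd)
    if chunk:
        chunks.append(chunk)
    os.close(read_fd)
    return chunks


print(chunked_write_demo(b"hello to a pipe\n", 4))
```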
The natural counterpart to a Python program that does unbuffered writing is one that does unbuffered reading, so let’s write one now.
Firstly, trying our previous program for reading from a pipe,
read_from_pipe.py (by running write_to_pipe_buf_0.py in one virtual terminal
and read_from_pipe.py in another), we see that it doesn’t display the message
letter by letter as it is received, but instead blocks. The problem is that
the iterator over the pipe’s file object, for l in f, uses a hidden read-ahead
buffer in Python 2, so for a short message it doesn’t yield anything until f
has reached EOF.
To read just a set number of bytes from a file object we need to try a
slightly different approach, using the f.read(x) method, which blocks until it
has read x bytes from f or reaches EOF, returning either a string of the data
read or an empty string if the file object is finished.
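It is worth checking what read hands back once the data runs out, because the loop we are about to write depends on it. The following sketch assumes an anonymous pipe in place of my_pipe and a made-up function name, read_until_eof. It writes three bytes, closes the write end, and collects every value read returns.

```python
import os


def read_until_eof(chunk_size):
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"abc")
    os.close(write_fd)       # closing the write end is what gives the reader EOF
    results = []
    with os.fdopen(read_fd, "r") as reader:
        while True:
            chunk = reader.read(chunk_size)
            results.append(chunk)
            if not chunk:    # an empty string marks the end of the data
                break
    return results


print(read_until_eof(2))
```

The final value collected is an empty string, which is falsy, so a plain if c: test is enough to detect the end of the message.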
Feeling our way towards solving the problem, a program like this:
bufsize = 1
with open("my_pipe", "r") as f:
    print "have opened pipe, commencing reading...."
    # read blocks until bufsize characters are available or EOF is reached
    c = f.read(bufsize)
    print c,
will read 1 character from the pipe, print it, and then finish. When we run it
with write_to_pipe_buf_0.py we get the first character of the message printed
as soon as it is available in the pipe. Increasing bufsize, we see that we can
alter the number of characters that are printed, with read blocking until that
many characters have been sent to the pipe.
Another bit of behaviour that should be noted is that when the process reading
from the pipe closes before the message has finished being written, the
writing process crashes, throwing an IOError exception complaining of a broken
pipe. We won’t concern ourselves with that here.
If we want to read a message through a pipe letter by letter, then, we need to
loop, reading a character whenever one is available, printing it, and waiting
for the next one. We break from the loop whenever read returns an empty
string, i.e. when the message has ended. In short, something like this:
with open("my_pipe", "r") as f:
    print "have opened pipe, commencing reading...."
    while True:
        c = f.read(1)
        if c:
            print c,
        else:
            break
The only problem with this is it doesn’t work! Once again the message won’t
display until it has been completely sent and the pipe closed. The problem
this time is write buffering on standard output. When writing to stdout (as
print does) the output is buffered, i.e. it will only display once a newline
is written or enough characters have been printed.
To circumvent this we need to go a little lower level and use Python’s sys
module. sys.stdout is a file object that represents Unix’s stdout stream, i.e.
the stream that, when written to, gets displayed on the screen.
Doing something like:
import sys
sys.stdout.write("hello world")
sys.stdout.flush()
will write a string to stdout and then flush it, i.e. force what’s been
written to be displayed on the screen. With this in mind, by changing our
unbuffered reading program to this:
import sys
with open("my_pipe", "r") as f:
    print "have opened pipe, commencing reading...."
    while True:
        c = f.read(1)
        if c:
            sys.stdout.write(c)
            sys.stdout.flush()
        else:
            break
and running this in concert with write_to_pipe_buf_0.py in another virtual
terminal, we see that it works how we want: the message is displayed letter by
letter, as it is received!
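If the same loop is needed in more than one place, it could be wrapped in a small function. This is only one possible sketch: the name echo_unbuffered and its sink parameter are assumptions introduced here, and the code avoids the print statement so that it should run under both Python 2 and Python 3.

```python
import sys


def echo_unbuffered(source, sink=sys.stdout):
    """Copy source to sink one character at a time, flushing after each one.

    Returns the number of characters copied.
    """
    count = 0
    while True:
        c = source.read(1)
        if not c:            # empty string: the writer has closed the pipe
            break
        sink.write(c)
        sink.flush()         # push the character out immediately
        count += 1
    return count
```

With the pipe from this section it would be called as echo_unbuffered(f) inside the with open("my_pipe", "r") as f: block. Taking the output as a parameter also makes the loop easy to try out against an in-memory file object instead of a real pipe.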