GIO tutorial: Stream IO

This article is a short tutorial showing GIO’s file reading and writing functionality.

Note: This article is part of my ongoing GIO tutorial series, which currently consists of: (1) File and path operations; (2) Stream IO (this document); and (3) Launching files.

Quick recap: GIO is an IO framework made for the GLib stack. File and directory operations are carried out through the GFile class, which provides the file, directory, and path manipulation functions.

In this tutorial we will do simple file read and write operations. The GIO reference documentation puts this under the Streaming IO category.

Diving in

>>> import gio
>>> f = gio.File('test.txt')
>>> stream = f.replace(None, False) # This should work, but doesn't.
Traceback (most recent call last):
  ...
TypeError: gio.File.replace() argument 1 must be string, not None
>>> stream = f.replace('', False) # Workaround
>>> stream.write("Hello, World!")
13
>>> stream.close()
True
>>> stream = f.read()
>>> stream.read()
'Hello, World!'
>>> stream.close()
True

Diving in, lazy version

>>> import gio
>>> f = gio.File('test.txt')
>>> f.replace_contents("Hello, World!")
'...'
>>> contents, length, etag = f.load_contents()
>>> contents
'Hello, World!'

Explanation: file output

To write to a file, first we need to create a GFile with the location of the file.

>>> f = gio.File('test.txt')

Next we need to create an output stream to the file. You can use either the replace or append_to method depending on what you’re trying to do; they correspond respectively to the “write” and “append” modes in standard IO terminology. We’ll use replace in this example.

>>> stream = f.replace(None, False) # This should work.
Traceback (most recent call last):
  ...
TypeError: gio.File.replace() argument 1 must be string, not None

Oops, that didn’t work. It’s a bug in the PyGIO version that I’m using, so for now we’ll work around it by passing in an empty string. It seems to work fine, but I won’t guarantee anything.

>>> stream = f.replace('', False)

For explanations regarding the function arguments, refer to the GFile documentation (C reference, Python reference). A nice but possibly unexpected feature of replace is that it handles file replacements safely, using temporary files and file renames.
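
The temporary-file-and-rename technique can be sketched in plain Python. This illustrates the general idea only and is not GIO's actual implementation. The helper name replace_file_safely is hypothetical, and os.replace assumes Python 3.3 or later.

```python
import os
import tempfile


def replace_file_safely(path, data):
    """Write data (bytes) to path via a temporary file and a rename.

    Illustration of the general technique only, not GIO's own code.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so that the rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the temporary file; the original file
        # has not been touched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader of the target file sees either the old contents or the new contents, never a half-written file.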

The result that we get is a GOutputStream (C reference, Python reference). Specifically, it’s a GFileOutputStream, but you normally won’t use that interface. GOutputStream resembles a file object in C or Python. You write to the stream using its write method.

>>> stream.write("Hello, World!")
13

Don’t forget to close the file once you’re done. Failing to do this for a local file is bad enough, but for a remote file it can be disastrous. If you want to be more careful, you should wrap all the stream operations in a try-finally block, closing the stream in the “finally” clause.

>>> stream.close()
True
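
The try-finally pattern mentioned above can be sketched as follows. RecordingStream is a hypothetical stand-in so that the example runs without PyGIO; the helper only assumes that the stream has write and close methods, as the stream returned by replace does.

```python
def write_and_close(stream, data):
    """Write data to stream and always close it, even if write() fails."""
    try:
        return stream.write(data)
    finally:
        stream.close()


class RecordingStream(object):
    """Hypothetical stand-in for an output stream (not part of GIO)."""

    def __init__(self):
        self.chunks = []
        self.closed = False

    def write(self, data):
        self.chunks.append(data)
        return len(data)

    def close(self):
        self.closed = True
        return True


stream = RecordingStream()
written = write_and_close(stream, b"Hello, World!")
```

With a real GFile, the same helper would be called as write_and_close(f.replace('', False), "Hello, World!").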

Note that, if you’re really lazy, you can replace the whole replace/write/close sequence with a single call to replace_contents.

>>> f.replace_contents("Hello, World!")
'...'

Explanation: file input

To read from a file, first create a GInputStream (C reference, Python reference) using GFile’s read method.

>>> stream = f.read()

GInputStream also has a method called read, which is used to actually read data from the stream. You can read the file progressively if you want to, but for this example we’ll just read the whole file.

>>> stream.read()
'Hello, World!'
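
Reading progressively usually means looping until the stream returns nothing more. A minimal sketch, using io.BytesIO as a stand-in for the input stream. It assumes that the stream's read method accepts a byte count and returns an empty string at the end of the stream, as Python file objects do; check the Python reference before relying on this with PyGIO. The name iter_chunks is hypothetical.

```python
import io


def iter_chunks(stream, chunk_size=4096):
    """Yield successive chunks until read() returns an empty result."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# io.BytesIO stands in for the input stream here.
stream = io.BytesIO(b"Hello, World!")
try:
    chunks = list(iter_chunks(stream, chunk_size=5))
finally:
    stream.close()
```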

Confusingly, the read method in the Python API actually corresponds to read_all in the C API, while Python’s read_part corresponds to read in C. Besides having different signatures, the two behave differently: according to the C documentation, read (Python: read_part) makes a single attempt and may return fewer bytes than requested, which is not an error, while read_all (Python: read) keeps reading until it has the requested number of bytes or reaches the end of the stream or an error.
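
For what it's worth, the C documentation describes g_input_stream_read as free to return fewer bytes than requested, and g_input_stream_read_all as repeating the read until the count is met or the stream ends. A minimal sketch of that relationship follows. TrickleStream and read_fully are hypothetical names, not part of GIO or PyGIO.

```python
class TrickleStream(object):
    """Hypothetical stream whose read_part returns at most 4 bytes per call."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def read_part(self, count):
        end = min(self._pos + min(count, 4), len(self._data))
        chunk = self._data[self._pos:end]
        self._pos = end
        return chunk


def read_fully(stream, count):
    """Call read_part repeatedly until count bytes arrive or the stream ends."""
    parts = []
    remaining = count
    while remaining > 0:
        chunk = stream.read_part(remaining)
        if not chunk:  # end of stream
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
```

A single read_part(13) on this stream returns only the first four bytes, while read_fully(stream, 13) returns all thirteen.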

Anyway, close the stream when you’re done.

>>> stream.close()
True

You can also replace the whole thing with a single call to GFile’s load_contents method.

>>> contents, length, etag = f.load_contents()
>>> contents
'Hello, World!'

Encoding matters

GIO reads and writes files in raw bytes format, which means everything is passed on without any encoding/decoding.

This makes for an important gotcha while writing to a file in the Python GIO API. You cannot write an arbitrary unicode string using PyGIO and expect it to work; you need to encode the string to a specific encoding first.

>>> f.replace_contents(u"\u00c5".encode('utf-8')) # Correct
'...'
>>> f.replace_contents(u"\u00c5") # Incorrect
Traceback (most recent call last):
  ...
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc5' in position 0: ordinal not in range(128)

As a rule of thumb, never pass a unicode object to be written to a file. assert isinstance(some_object, str) makes for a good way to check for this kind of mistake during development.
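
Instead of only asserting, you can funnel everything through a small helper that encodes text before it reaches the stream. A minimal sketch; the name to_bytes and the UTF-8 default are assumptions made for illustration, and the code is written to run on both Python 2 and Python 3.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding it first if it is text."""
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, bytes):
        return value
    raise TypeError("expected text or bytes, got %r" % type(value))


payload = to_bytes(u"\u00c5")
```

The result can then be passed to replace_contents or to a stream's write method.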

While reading a file, GIO will also return raw strings. In PyGIO, you will get a str object, and nothing in it tells you which encoding to use to decode it into unicode; you have to know the encoding yourself. I’m guessing that for some types of streams (e.g. HTTP) the encoding information is available in the GInputStream object, but so far I haven’t figured out how to get this data.
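
When the encoding is known or can be narrowed down to a few candidates, decoding the raw result is a plain Python operation. A minimal sketch; decode_contents and its candidate list are assumptions for illustration, and a successful decode is still only a guess, since latin-1 accepts any byte sequence.

```python
def decode_contents(raw, encodings=("utf-8", "latin-1")):
    """Try each candidate encoding in turn and return (text, encoding)."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


# The two bytes below are the UTF-8 encoding of U+00C5.
text, used = decode_contents(b"\xc3\x85")
```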

Newline matters

If you’re used to reading/writing text files in Python or Perl, there’s another gotcha that you should know. Remember that GIO writes files in raw bytes format. It doesn’t translate to/from the platform-specific newline characters. That means, if you write “\n” to a file it will be written as a line feed character (0x0A), even on Windows, where the native line ending is a carriage return followed by a line feed (0x0D 0x0A).

If you need to write a platform-specific newline character, you can use Python’s os.linesep.
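
A minimal sketch of doing the newline translation by hand before the text is encoded and written. The helper name to_platform_newlines is hypothetical, and it assumes the input text uses bare line feeds only.

```python
import os


def to_platform_newlines(text, linesep=os.linesep):
    """Replace each line feed in text with linesep.

    Assumes text contains bare line feeds only; text that already holds
    carriage return plus line feed pairs would end up with doubled
    carriage returns.
    """
    if linesep == "\n":
        return text
    return text.replace("\n", linesep)


# Passing linesep explicitly here so the result is the same on any platform.
data = to_platform_newlines(u"first\nsecond\n", linesep="\r\n").encode("utf-8")
```

Leaving linesep at its default picks up the convention of the platform the script runs on.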

What now?

There’s not much more to this topic. Seeking is done through the GSeekable interface, which not all streams implement. As with many other GIO functions, stream read/write methods have asynchronous counterparts and can be cancelled.

If you’re curious, try reading and writing a remote file. It should just work.
