This article is a short tutorial showing GIO’s file reading and writing functionality.
Quick recap: GIO is an IO framework made for the GLib stack. File and directory operations are carried out through the GFile class, which handles file/directory/path manipulation functions.
In this tutorial we will do simple file read and write operations. The GIO reference documentation puts this under the Streaming IO category.
>>> import gio
>>> f = gio.File('test.txt')
>>> stream = f.replace(None, False)  # This should work, but doesn't.
Traceback (most recent call last):
  ...
TypeError: gio.File.replace() argument 1 must be string, not None
>>> stream = f.replace('', False)  # Workaround
>>> stream.write("Hello, World!")
13
>>> stream.close()
True
>>> stream = f.read()
>>> stream.read()
'Hello, World!'
>>> stream.close()
True
Diving in, lazy version
>>> import gio
>>> f = gio.File('test.txt')
>>> f.replace_contents("Hello, World!")
'...'
>>> contents, length, etag = f.load_contents()
>>> contents
'Hello, World!'
Explanation: file output
To write to a file, first we need to create a GFile with the location of the file.
>>> f = gio.File('test.txt')
Next we need to create an output stream to the file. You can use either the replace or the append_to method depending on what you’re trying to do; they correspond respectively to the “write” and “append” modes in standard IO terminology. We’ll use replace in this example.
>>> stream = f.replace(None, False)  # This should work.
Traceback (most recent call last):
  ...
TypeError: gio.File.replace() argument 1 must be string, not None
Oops, that didn’t work. It’s a bug in the PyGIO version that I’m using, so for now we’ll work around it by passing in an empty string. It seems to work fine, but I won’t guarantee anything.
>>> stream = f.replace('', False)
For explanations regarding the function arguments, refer to the GFile documentation (C reference, Python reference). A nice but possibly unexpected feature of replace is that it handles file replacements safely, using temporary files and file renames.
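To give a feel for what “temporary files and file renames” means, here is a rough sketch of the general technique using only Python’s standard library. It is not GIO’s actual implementation; safe_replace is a hypothetical helper, and it is written for Python 3 (it relies on os.replace).

```python
import os
import tempfile


def safe_replace(path, data):
    """Replace the file at `path` with `data` (bytes) via a temp file and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Swap the finished file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, remove the temp file and leave the original untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The point of the pattern is that a reader never sees a half-written file: either the old contents are still there, or the new ones are.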
The result that we get is a GOutputStream (C reference, Python reference). Specifically, it’s a GFileOutputStream, but you normally won’t use that interface. GOutputStream resembles a file object in C or Python. You write to the stream using its write method:
>>> stream.write("Hello, World!")
13
Don’t forget to close the file once you’re done. Failing to do this for a local file is bad enough, but for a remote file it can be disastrous. If you want to be more careful, you should wrap all the stream operations in a try-finally block, closing the stream in the “finally” clause.
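The try-finally pattern can be sketched like this. write_and_close is a hypothetical helper that only assumes the stream has write and close methods, as the GIO stream in this tutorial does; RecordingStream is a made-up stand-in so the example runs without GIO installed.

```python
def write_and_close(stream, data):
    """Write `data` to `stream`, closing the stream even if write() raises."""
    try:
        return stream.write(data)
    finally:
        stream.close()


class RecordingStream(object):
    """Hypothetical stand-in for an output stream; it only records calls."""

    def __init__(self, fail=False):
        self.fail = fail
        self.data = b""
        self.closed = False

    def write(self, data):
        if self.fail:
            raise IOError("simulated write failure")
        self.data += data
        return len(data)

    def close(self):
        self.closed = True
        return True
```

With a real stream you would call something like write_and_close(f.replace('', False), "Hello, World!"); the stream gets closed whether or not the write succeeds.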
>>> stream.close()
True
Note that, if you’re really lazy, you can replace the whole replace/write/close sequence with a single call to replace_contents:
>>> f.replace_contents("Hello, World!")
'...'
Explanation: file input
>>> stream = f.read()
Calling read on the GFile gives us a GInputStream. GInputStream also has a method called read, which is used to actually read data from the stream. You can read the file progressively if you want to, but for this example we’ll just read the whole file.
>>> stream.read()
'Hello, World!'
Note that the read method in the Python API actually corresponds to read_all in the C API, while Python’s read_part corresponds to read in C. As to the difference between the two (read and read_part), I have no idea. Have a look at the GInputStream documentation and see if you can spot the difference between the two in terms of functionality (I know they have different signatures).
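For what it’s worth, the C reference describes g_input_stream_read as a single read that may return fewer bytes than requested, while g_input_stream_read_all keeps reading until the requested count is reached, the stream ends, or an error occurs. The sketch below imitates that behaviour with a made-up ShortReader class; neither the class nor the read_all function here is part of GIO.

```python
import io


class ShortReader(object):
    """Hypothetical stream that hands back at most 4 bytes per read call."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, count):
        return self._buf.read(min(count, 4))


def read_all(stream, count):
    """Keep calling read() until `count` bytes arrive or the stream ends."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # end of stream
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A single read on ShortReader comes back short, whereas read_all loops until it has everything that was asked for or the data runs out.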
Anyway, close the stream when you’re done.
>>> stream.close()
True
You can also replace the whole thing with a single call to GFile’s load_contents method:
>>> contents, length, etag = f.load_contents()
>>> contents
'Hello, World!'
GIO reads and writes files in raw bytes format, which means everything is passed on without any encoding/decoding.
This makes for an important gotcha while writing to a file in the Python GIO API. You cannot write an arbitrary unicode string using PyGIO and expect it to work; you need to encode the string to a specific encoding first.
>>> f.replace_contents(u"\u00c5".encode('utf-8'))  # Correct
'...'
>>> f.replace_contents(u"\u00c5")  # Incorrect
Traceback (most recent call last):
  ...
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc5' in position 0: ordinal not in range(128)
As a rule of thumb, never pass a unicode object to be written to a file. assert isinstance(some_object, str) makes for a good way to check for this kind of mistake during development.
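If you’d rather convert than assert, a small helper can do the encoding in one place. to_bytes is a hypothetical name, and the UTF-8 default is an assumption; pick whatever encoding your file format calls for.

```python
def to_bytes(text, encoding="utf-8"):
    """Return `text` as bytes, encoding it first if it is a text string."""
    if isinstance(text, bytes):
        # Already raw bytes: pass through unchanged.
        return text
    return text.encode(encoding)
```

You would then write f.replace_contents(to_bytes(some_object)) instead of passing the object in directly.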
While reading a file, GIO will also return raw strings. In PyGIO, you will get a str object with no encoding information attached, so you have to know (or guess) the encoding before you can turn it into unicode. I’m guessing that for some types of streams (e.g. HTTP) the encoding information is available in the GInputStream object, but so far I haven’t figured out how to get this data.
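When you do know the encoding, decoding is a one-liner. The sketch below assumes UTF-8 and swaps in the Unicode replacement character for anything that doesn’t decode; to_text is a hypothetical helper, not something PyGIO provides.

```python
def to_text(raw, encoding="utf-8"):
    """Decode raw bytes, substituting U+FFFD for undecodable sequences."""
    return raw.decode(encoding, "replace")
```

Drop the "replace" argument if you would rather get a UnicodeDecodeError on bad input than silently lose data.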
If you’re used to reading/writing text files in Python or Perl, there’s another gotcha that you should know about. Remember that GIO writes files in raw bytes format. It doesn’t translate to/from the platform-specific newline characters. That means that if you write “\n” to a file, it will be written as a line feed character (0x0A), even on Windows and Mac OS.
If you need to write a platform-specific newline character, you can get it from Python’s standard library (os.linesep is one option).
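One way to produce platform newlines before handing the data to GIO is os.linesep from the standard library. with_platform_newlines is a hypothetical helper, and it assumes the input only uses “\n” as its line ending.

```python
import os


def with_platform_newlines(text):
    """Replace each line feed in `text` with the platform's line separator."""
    return text.replace("\n", os.linesep)
```

Do the conversion before encoding the string to bytes, so the encoded output carries the right newline sequence.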
There’s not much more to this topic. Seeking is done through the GSeekable interface, which not all streams implement. As with many other GIO functions, stream read/write methods have asynchronous counterparts and can be cancelled.
If you’re curious, try reading and writing a remote file. It should just work.