Tag Archives: programming

On the search for a Make replacement

Make is great, but there are some issues with it that are probably impossible to fix now. So I’ve been looking for a replacement that I can use for simple task automation; surely in the 40+ years of Make’s lifetime someone has written something better, right?

These are the notes I made while evaluating the different options that I explored. I’m interested to see if anyone has comments, corrections, or other suggestions.

Make

  • Implemented in: Various programming languages (probably mainly C)
  • Script language: Make (various dialects)
  • Metaprogramming: Yes (major implementations)
  • Biggest issue: Stringly typed

Good old Make. Very nice for all sorts of tasks, until you need to deal with files containing space and/or quote characters, where things start to go downhill.

Ninja

  • Implemented in: C++
  • Script language: Ninja
  • Metaprogramming: No
  • Biggest issue: Not programmable

This is only mentioned for completion because Ninja is in a completely different ballpark. It’s meant to be a target language for buildfile generators like CMake and Meson, so by itself it has zero programmability, and you wouldn’t really want to write it by hand.

Just

  • Implemented in: Rust
  • Script language: Just (Make-like)
  • Metaprogramming: No
  • Biggest issue: Stringly typed

This feels like a Make variant with fewer features than most Make implementations. I don’t see this as a practical choice for any project, at least right now.

Task

  • Implemented in: Go
  • Script language: YAML + Go template
  • Metaprogramming: No
  • Biggest issue: Stringly typed

I actually really like the idea of Task. For very simple use cases it’s very elegant, because its whole syntax is just 100% valid YAML with a bit of string templating. The templating sometimes gets in the way, though, because the use of {{.VAR}} for variables conflicts with YAML’s {a: b} mapping construct, forcing you to waste one level of string quoting on it.

A bigger flaw is that there’s no easy way to override variables from the command line. I think you can work around this by jumping through scripting hoops, but then you lose a ton of elegance points.

And the biggest flaw: it’s still stringly typed just like Make, so you’ll have trouble separating strings from lists.

Gulp

  • Implemented in: JavaScript
  • Script language: JavaScript
  • Metaprogramming: Yes
  • Biggest issue: Doesn’t track outputs?

This looked promising until I noticed that outputs didn’t seem to be tracked anywhere, which means everything gets rebuilt all the time. Is this really the case or am I missing something?

Grunt

  • Implemented in: JavaScript
  • Script language: JavaScript
  • Metaprogramming: Yes
  • Biggest issue: Convoluted just to get started

Grunt’s documentation is rather bad, and the examples they have all throw you in the deep, ugly end. Skimming these introductory materials, I couldn’t figure out how to write the simplest build file, which seems a bad sign.

Rake

  • Implemented in: Ruby
  • Script language: Ruby
  • Metaprogramming: Yes
  • Biggest issue: Ruby’s shell module is broken on Windows

This one looked very promising. I’m not a fan of Ruby, but was willing to put up with it because Rake seemed to do all I wanted. But then I discovered that Ruby’s most reasonable subprocess handler, the shell module, breaks on Windows. Without it, you’re back to various ugly half-baked APIs, each with their own limitations.

SCons and Waf

  • Implemented in: Python
  • Script language: Python
  • Metaprogramming: Yes
  • Biggest issue: Not designed for simple tasks

Two competing Python-based build systems. These seem too complicated for my use cases. I think making them suit simple tasks would be a significant undertaking. Or perhaps I’m just missing a documentation that is not mainly targeted at people trying to create a build pipeline for their C projects.

doit

  • Implemented in: Python
  • Script language: Python
  • Metaprogramming: Yes
  • Biggest issue: Verbose

With this one you end up with lots of boilerplate because rather than writing tasks, you’re writing task creators. It makes sense, but it feels like doing things at a too-low level when you want it to be a simple Make alternative.

The author suggests several high-level interfaces that can be implemented on top of doit. They do limit what you can do, but you can always write normal doit task creators in addition to the simplified versions. I think this is a reasonable compromise and I particularly like the decorator version.

The only remaining problem, then, is that Python’s subprocess handling is very cumbersome. There are two libraries I know of that can rectify this: sh and Plumbum. sh, in my opinion, is not suitable for use in a Make replacement use case. The way it does piping by default is not in line with what we expect, coming from Make. Plumbum is not perfect but better (you still have to end everything with .run_fg() or the magical & FG).

A quirk of doit is that it creates a cache file (or files) alongside your build file. Depending on the exact database backend used, it can create up to three files, which I’d say is not ideal.

Conclusion

I have for now settled on doit + Plumbum with around 100 lines of support code. I’m not fully happy with this, and I’m not sure it can cover all my use cases, but I think it’s time for me to put my ideas and investigations out there and seek comments.

Rake is almost what I need, if not for what I believe is a bug in Ruby’s standard library. But even if it’s fixed, I’d prefer to stick with a Python-based solution if possible.

Transliterating arbitrary text into Latin script

This post explores one of the capabilities of the PyICU library, namely its text transformation module. Specifically, we’ll look at the simplest use case: transliterating text into Latin script.

Say you are given a list of phrases, names, titles, whatever, in a script that you can’t read. You want to be able to differentiate them, but how, when they all look like random lines and curves? Well, let’s turn them into Latin characters!

>>> import icu
>>> tr = icu.Transliterator.createInstance('Any-Latin; Title').transliterate
>>> tr('Ἀριστοτέλης, Πλάτων, Σωκράτης')
'Aristotélēs, Plátōn, Sōkrátēs'

There we go. Even if you still can’t pronounce these names correctly, at least they’re hopefully easier to recognise because they are now in a script that you are more used to reading (unless you’re Greek, of course).

'Any-Latin; Title' means we want to transliterate from any script to Latin, then convert it to title case. If that’s too simple, the ICU documentation has the gory details of all the supported transforms.

Easy, no?

Caveats

Do not rely on the output as pronunciation guide unless you know what you’re doing. For example, the Korean character 꽃 is transliterated by ICU as kkoch to keep it reversible, even though the word certainly does not sound like the gunmaker’s nor Kochie’s last names, and definitely not like the synonym for rooster (the modern romanisation, which matches closer to the correct pronunciation, is kkot).

The transliteration of Han characters (shared between Chinese, Japanese, and Korean) uses Chinese Pinyin, and thus may not resemble the Japanese and Korean romanisations at all. This makes the transliteration of many Japanese texts particularly awful.

>>> tr('日本国')  # "Nippon-koku" = Japan
'Rì Běn Guó'

Oops, that could start an Internet war. Use a different library if you are primarily dealing with Japanese text.

Another unfortunate thing with ICU is that there are still scripts that it doesn’t support at all. For example, it can’t transliterate to/from Javanese.

>>> tr('ꦫꦩꦏꦮꦸꦭꦲꦶꦁꦱ꧀ꦮꦂꦒ')
'ꦫꦩꦏꦮꦸꦭꦲꦶꦁꦱ꧀ꦮꦂꦒ'

Maybe one day.

Using GitLab’s CI server

GitLab provides a continuous integration server, which is pretty nice for building, testing, and packaging your software and having all the UI integrated in GitLab. If you’re just using the free GitLab.com hosting, you get to utilise their Docker-based runner. (If your build process requires a non-Linux OS you’ll have to provide your own runner.)

Getting a basic build up and running is pretty simple. For example, here’s one job named test that only runs make check:

test:
  script:
    - make check

If your test suite can measure code coverage, GitLab can also show it in the UI. At the moment this feature is rather rudimentary and requires you to go to the project settings and enter a regular expression to find the coverage amount in the build output.

The following is an example that works with coverage.py when you only have a single Python file. I haven’t tried it with multiple files; it may require a wrapper script that calculates the total coverage amount.

test:
  image: python:3-alpine
  script:
    - pip install coverage
    - coverage run foo.py
    - coverage report -m

# Regex that matches the coverage amount:
# ^\S+\.py\s+\d+\s+\d+\s+(\d+\%)

A few lessons learnt from setting up test hooks for a small Python app:

  • There is no way to test changes to your build script without pushing a commit. And then the build results will stay in your project page forever with no way to clean up the irrelevant ones. You can run builds locally with gitlab-runner, e.g. gitlab-runner exec shell test to run the test job on the local shell (replace shell with docker to use Docker). (Thanks Evan Felix for the info about gitlab-runner.)
  • GitLab.com’s default Docker image (ruby:2.1 at the time of writing) is really fast to spin up, possibly because it’s cached. However, you should still explicitly name a Docker image in case the default changes.
  • Installing packages is slower than downloading a Docker image. It’s not worth going out of your way to use the default image if you then have to call apt-get. Try to find a pre-built Docker image that has all the packages you need instead. However, we’re talking about differences of about ten seconds, so just choose the image that is most convenient.
  • The ruby:2.1 image has Python 2 but not Python 3.
  • Docker’s Python repository lists a number of Alpine-based images. As you would expect, these are smaller and slightly faster to download than the other (Debian-based) images.
  • The coverage regex requires the percent sign to be escaped (\%).

CMake’s ugly programming language

I’ve just discovered Rosetta Code not long ago, and found it quite fun to browse around. It shows you the code for various programming tasks in different programming languages. While looking at the Quicksort page, I noticed that it didn’t have a CMake version, so I decided to try writing one.

function (quicksort array_var)
    set (array ${${array_var}})
    if ("${array}" STREQUAL "")
        return ()
    endif ()

    set (less)
    set (equal)
    set (greater)
    list (GET array 0 pivot)

    foreach (x ${array})
        if (x LESS pivot)
            list (APPEND less "${x}")
        elseif (x EQUAL pivot)
            list (APPEND equal "${x}")
        else ()
            list (APPEND greater "${x}")
        endif ()
    endforeach ()

    set (array)
    if (NOT less STREQUAL "")
        quicksort (less)
        list (APPEND array ${less})
    endif ()
    list (APPEND array ${equal})
    if (NOT greater STREQUAL "")
        quicksort (greater)
        list (APPEND array ${greater})
    endif ()
    set ("${array_var}" ${array} PARENT_SCOPE)
endfunction ()

set (a 4 65 2 -31 0 99 83 782 1)
quicksort (a)
message ("${a}")

I’ve worked with CMake for years, and I think it’s a good build system, but I really wish it had switched to a saner language. The CMake language is actually pretty simple and consistent at the syntax level: everything is in the form command(string), where the string syntax is slightly confusing but still rather understandable once you’ve figured out the quoting and variable expansion mechanisms. It’s how that string argument is used that can be messy, inconsistent, and ambiguous. Effectively, it’s as if every command had its own syntax.

Around 2008, there was an experiment to allow writing CMake scripts in Lua. The project never caught on and was abandoned. I think part of the reason was that the thread discussing it in Lua’s mailing list was single-handedly derailed into pointless bickering (which reminds me of the “poisonous people” talk).

CMake is stuck with a mediocre programming language for the foreseeable future. It’s not as bad as it sounds, though. The simplicity of the syntax has its advantages, and writing CMake buildfiles rarely gets frustrating. It would make a terrible general programming language, but as a build system scripting language it’s workable.

Win32 Python: Getting all window titles

This post shows how you can retrieve all window titles in Microsoft Windows using Python’s ctypes module. Moreover, it also acts as a ctypes tutorial, showing how to create and use callback functions.

The following is the full code. Keep reading if you want to understand how it works. (Note: If you are reading this as a ctypes tutorial and are having trouble following the explanation, you may want to go through my previous tutorial first.)

import ctypes

EnumWindows = ctypes.windll.user32.EnumWindows
EnumWindowsProc = ctypes.WINFUNCTYPE(ctypes.c_bool, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int))
GetWindowText = ctypes.windll.user32.GetWindowTextW
GetWindowTextLength = ctypes.windll.user32.GetWindowTextLengthW
IsWindowVisible = ctypes.windll.user32.IsWindowVisible

titles = []
def foreach_window(hwnd, lParam):
	if IsWindowVisible(hwnd):
		length = GetWindowTextLength(hwnd)
		buff = ctypes.create_unicode_buffer(length + 1)
		GetWindowText(hwnd, buff, length + 1)
		titles.append(buff.value)
	return True
EnumWindows(EnumWindowsProc(foreach_window), 0)

print(titles)

Continue reading

If you get a PUSHL-related error…

If you’re compiling a piece of code and getting an error message saying something about PUSHL or “invalid suffix for PUSH”, it means you’re feeding x86 assembly code to an x86_64 assembler.

Possible causes:

  • The code you’re trying to compile really does contain x86 assembly, perhaps inlined from C, but you’re targeting x86_64.
  • Your toolchain somehow contains a mix of x86 compiler and x86_64 assembler. Try cleaning up your PATH.

strxfrm in Vala

Here, let me save you some time: string.collate_key.

I spent hours trying to get Posix.strxfrm to work and still failed. It was actually so much easier to write a simple strxfrm wrapper in C and use it as an external function. While admiring my work on this wrapper, I found GLib’s g_utf8_collate_key, which in Vala translates to the aforementioned string.collate_key.

By the way, I’ve just realised that strxfrm stands for string transform.