k1lib

PyTorch is awesome, and it provides a very effective way to execute ML code fast. What it lacks is the surrounding infrastructure that makes the general debugging and discovery process better. Other, more official wrapper frameworks don't quite make sense to me, so this is an attempt at creating a robust suite of tools that does.


Let's see an example:

Overview

k1lib.imports is just a file that imports lots of common utilities, so that importing stuff is easier and quicker.
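In practice that usually means a single star import at the top of your notebook (exactly what gets re-exported depends on your k1lib version):

from k1lib.imports import *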

Here is our network. Just a normal feed-forward network, with skip blocks in the middle.
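The exact architecture doesn't matter here; a minimal PyTorch sketch of that idea (all names below are made up for illustration) would be something like:

import torch.nn as nn

class SkipBlock(nn.Module):
    """A block whose output gets added back to its input (a skip connection)."""
    def __init__(self, features):
        super().__init__()
        self.linear = nn.Linear(features, features); self.relu = nn.ReLU()
    def forward(self, x): return x + self.relu(self.linear(x))

class Net(nn.Module):
    """A normal feed-forward network, with skip blocks in the middle."""
    def __init__(self, inF=10, hidden=32, outF=1):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(inF, hidden), nn.ReLU(),
            SkipBlock(hidden), SkipBlock(hidden),
            nn.Linear(hidden, outF))
    def forward(self, x): return self.layers(x)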

Here is where things get a little more interesting. k1lib.Learner is the main wrapper where training will take place. It has 4 basic parameters that must be set before training: model, data loader, optimizer, and loss function.
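Setting it up could look roughly like this (l.model, l.data, l.opt, l.lossF and l.run() are assumptions based on the description above; check the k1lib.Learner docs for the exact attribute names):

import torch, torch.nn as nn, k1lib

l = k1lib.Learner()
l.model = Net()                                           # the network from above
l.data = ...                                              # train/valid data loaders (see the Data loader section)
l.opt = torch.optim.Adam(l.model.parameters(), lr=3e-3)
l.lossF = nn.MSELoss()
l.run(1)                                                  # assumed: train for 1 epoch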

Tip: docs are tailored for each object, so you can do print(obj) or just obj in a code cell to see them.

There are lots of Callbacks. What they are will be discussed later, but here's a tour of a few of them:

ParamFinder

As advertised, this callback searches for a perfect parameter for the network.

Loss

The returned data type is a k1lib.viz.SliceablePlot, so you can zoom the plot into a specific range, like this:

Notice how the original train range is [0, 250] and the valid range is [0, 60]. When sliced with [120:], the train range is sliced as expected from the middle to the end, while the valid range adapts and is also sliced from the middle to the end ([30:]).
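Assuming plot holds the returned SliceablePlot, that zoom is literally just indexing:

plot[120:]   # train now shows [120, 250], valid adapts to roughly [30, 60]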

LossLandscape

Oh, and yeah, this callback can give you a quick view of what the loss landscape looks like. The center point (0, 0) is always at the lowest portion of the landscape, so that tells us the network has learned something.

HookParam

This tracks the parameters' means, stds, mins and maxs while training. You can also display only a certain number of parameters:

HookModule

Pretty much the same thing as before. This callback hooks into selected modules and captures the forward and backward passes. Both HookParam and HookModule only hook into selected modules (by default, all of them are selected):

CSS module selector

You can select specific modules by setting l.css = ..., kinda like this:
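The exact selector syntax is documented in k1lib.selector; purely as a hypothetical illustration of the shape of it, using the props mentioned below:

l.css = "Linear: HookModule"   # hypothetical: select every Linear module and give it the "HookModule" prop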

Essentially, you can attach props to whichever modules and parameters you select, and different callbacks will recognize certain props. HookModule will hook all modules with the props "all" or "HookModule". Likewise, HookParam will hook all parameters with the props "all" or "HookParam".

Data loader

It's simple, really! l.data contains a train and a valid data loader, and each has multiple ways to unpack values.
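For example (a sketch, assuming the loaders yield (x, y) batches, which is one of the ways to unpack them):

for xb, yb in l.data.train:     # l.data.valid works the same way
    print(xb.shape, yb.shape)
    break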

Callbacks

Let's look at l again:

l.model and l.opt are simple enough; they're just PyTorch primitives. Where most of the magic lies is in l.cbs, an object of type k1lib.Callbacks, a container for k1lib.Callback objects. Notice the final "s" in the name.

A callback is pretty simple. While training, you may want to insert functionality here and there. Let's say you want the program to print out a progress bar while training. You can edit the learning loop directly, with some internal variables to keep track of the current epoch and batch, like this:

import time

startTime = time.time()
for epoch in range(epochs):
    for batch in range(batches):
        # do training
        data = getData()
        train(data)

        # calculate progress
        elapsedTime = time.time() - startTime
        progress = round((batch / batches + epoch) / epochs * 100)
        print(f"\rProgress: {progress}%, elapsed: {round(elapsedTime, 2)}s         ", end="")

But this means when you don't want that functionality anymore, you have to know what internal variable belongs to the progress bar, and you have to delete it. With callbacks, things work a little bit differently:

import k1lib

class ProgressBar(k1lib.Callback):
    def startRun(self):
        pass
    def startBatch(self):
        self.progress = round((self.batch / self.batches + self.epoch) / self.epochs * 100)
        a = f"Progress: {self.progress}%"
        b = f"epoch: {self.epoch}/{self.epochs}"
        c = f"batch: {self.batch}/{self.batches}"
        print(f"{a}, {b}, {c}")

class Learner:
    def run(self):
        self.epochs = 1; self.batches = 10

        self.cbs = k1lib.Callbacks()
        self.cbs.append(ProgressBar())

        self.cbs("startRun")
        for self.epoch in range(self.epochs):
            self.cbs("startEpoch")
            for self.batch in range(self.batches):
                self.xb, self.yb = getData()
                self.cbs("startBatch")

                # do training
                self.y = self.model(self.xb); self.cbs("endPass")
                self.loss = self.lossF(self.y, self.yb); self.cbs("endLoss")
                if self.cbs("startBackward"): self.loss.backward()

                self.cbs("endBatch")
            self.cbs("endEpoch")
        self.cbs("endRun")

This is a stripped-down version of k1lib.Learner, just to get the idea across. The point is, whenever you do self.cbs("startRun"), it runs through all the k1lib.Callback objects it has (ProgressBar in this example), checks whether each one implements startRun, and if so, executes it.

Inside ProgressBar's startBatch, you can access the learner's current epoch by doing self.learner.epoch. But you can also just do self.epoch. If the attribute is not defined on the callback, it is automatically looked up on self.learner.
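Both tricks are easy to sketch. This is not k1lib's actual implementation, just the idea behind it:

class Callback:
    def __getattr__(self, name):
        # only called when the attribute isn't found on the callback itself;
        # fall back to the learner, so self.epoch behaves like self.learner.epoch
        if name == "learner": raise AttributeError(name)
        return getattr(self.learner, name)

class Callbacks:
    def __init__(self, learner):
        self.learner = learner; self.cbs = []
    def append(self, cb):
        cb.learner = self.learner; self.cbs.append(cb)
    def __call__(self, eventName):
        # run every callback that actually defines this event
        for cb in self.cbs:
            method = getattr(type(cb), eventName, None)
            if method is not None: method(cb)
        # (the real Callbacks also uses the return value, so callbacks can gate
        # steps like loss.backward(); that part is omitted here)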

As you can see, if you want to get rid of the progress bar without using k1lib.Callbacks, you have to delete the startTime line and the actual calculate-progress lines. This requires you to remember which lines belong to which functionality. If you use the k1lib.Callbacks mechanism instead, then you can just comment out self.cbs.append(ProgressBar()), and that's it. This makes swapping out components extremely easy, repeatable, and beautiful.

Other use cases include intercepting at startBatch and pushing all the training data to the GPU. You can also reshape the data however you want, insert different loss mechanisms (endLoss) in addition to lossF, or quickly inspect the model output. You can also change learning rates while training (startEpoch) according to some schedule. The possibilities are practically endless.
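For instance, a hand-rolled callback along these lines (k1lib may well ship something like it already; this is just to show the mechanism) could push each batch to the GPU:

class ToGPU(k1lib.Callback):
    """Moves the current batch to the GPU right before it's trained on."""
    def startBatch(self):
        self.learner.xb = self.learner.xb.cuda()
        self.learner.yb = self.learner.yb.cuda()

l.cbs.append(ToGPU())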