k1lib.mo module

This module is quite dope. It essentially allows you to construct, explore and simulate molecules (hence the "mo") quite easily. I suggest opening the official docs in another tab for reference. Let's get started.

Making molecules

The basis for everything is the Atom class. There are several built-in substances:

And there are some complicated substances too:

You can create a new substance really easily:

That's it. Remember that each time you're accessing the substance, it'll create a new Atom entirely:
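To make the "new Atom on every access" behavior concrete, here's a minimal library-free sketch. Everything in it (ToyAtom, Substances, toy_mo, register) is a hypothetical stand-in I made up for illustration, not k1lib's actual API or implementation; it just shows one way attribute access can hand back a fresh object each time.

```python
class ToyAtom:
    """Hypothetical stand-in for an atom; NOT k1lib's Atom class."""

    def __init__(self, name, valence):
        self.name = name
        self.valence = valence
        self.bonds = []  # atoms this one is bonded to


class Substances:
    """Hypothetical registry: attribute access runs a factory every time."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        # factory is a zero-argument callable that builds the substance
        self._factories[name] = factory

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        factories = self.__dict__.get("_factories", {})
        if name not in factories:
            raise AttributeError(name)
        return factories[name]()  # brand new object on every access


toy_mo = Substances()
toy_mo.register("C", lambda: ToyAtom("C", 4))

first = toy_mo.C
second = toy_mo.C
print(first is second)  # two separate objects, so edits to one don't leak
```

The upside of this design is that you can mutate whatever you get back (bond stuff to it, strip hydrogens) without corrupting the "template" that later accesses rely on.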

For complex substances, it will still return a single Atom, not some other weird data structure:

Let's see methane:

The highlighted circle is the current Atom we're showing the molecule from, and the arrows indicate the backbone's direction. To form a bond, you can do something like this:

Forming new bonds is sort of "safe". This means you don't have to pay too much attention to detail and it will still work. Let's bond a lone C to methane:

As you can see, the methane automatically disconnects one hydrogen to make room for our bond. This means you can create complex molecules effortlessly. Here's ethanol:
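The "safe" bonding idea (drop a hydrogen when the atom is already full) can be sketched without the library. This is a toy model under my own assumptions: ToyAtom, its valence field and the _make_room helper are all hypothetical, and the real Atom class may decide which hydrogen to drop differently.

```python
class ToyAtom:
    """Hypothetical stand-in; NOT k1lib's Atom class."""

    def __init__(self, name, valence):
        self.name = name
        self.valence = valence  # max number of bonds (assumed single bonds only)
        self.bonds = []

    def _unbond(self, other):
        self.bonds.remove(other)
        other.bonds.remove(self)

    def _make_room(self):
        # Nothing to do if there is still a free slot.
        if len(self.bonds) < self.valence:
            return
        # Otherwise drop the first hydrogen we find.
        for neighbor in self.bonds:
            if neighbor.name == "H":
                self._unbond(neighbor)
                return
        raise ValueError(f"{self.name} is full and has no hydrogen to drop")

    def bond(self, other):
        self._make_room()
        other._make_room()
        self.bonds.append(other)
        other.bonds.append(self)
        return other


def methane():
    carbon = ToyAtom("C", 4)
    for _ in range(4):
        carbon.bond(ToyAtom("H", 1))
    return carbon


ch4 = methane()
ch4.bond(ToyAtom("C", 4))  # ch4 was full, so one H is dropped first
print(sorted(atom.name for atom in ch4.bonds))
```

After the last bond the former methane carbon holds three hydrogens plus the new carbon, which is the same thing the picture above shows.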

a.bond(b) is really the same as a(b), but a.bond(b) will return b, and a(b) will return a instead. There are several "convenience" methods to create molecules, like mo.alkane() and mo.alcohol():
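The difference in return value between a.bond(b) and a(b) matters mostly for chaining. Here's a tiny sketch of that, again with a hypothetical ToyAtom rather than the real Atom class (no valence checks, just the return-value behavior described above).

```python
class ToyAtom:
    """Hypothetical stand-in; only models what bond() and () return."""

    def __init__(self, name):
        self.name = name
        self.bonds = []

    def bond(self, other):
        self.bonds.append(other)
        other.bonds.append(self)
        return other  # returns the argument: handy for walking down a chain

    def __call__(self, other):
        self.bond(other)
        return self  # returns the receiver: handy for decorating one center


# .bond() chains outward: C - C - O, and we end up holding the O
c1 = ToyAtom("C")
tail = c1.bond(ToyAtom("C")).bond(ToyAtom("O"))
print(tail.name)

# () keeps returning the same center, so every H lands on it
center = ToyAtom("C")(ToyAtom("H"))(ToyAtom("H"))(ToyAtom("H"))
print(center.name, len(center.bonds))
```

So reach for .bond() when you're extending a backbone, and for the call form when you're hanging several groups off one atom.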

For complex substances, you can choose not to display Hydrogens, to clear things up a bit:

You can also traverse the molecule quite easily:

So when you call .next(), it will return the next atom along the backbone, which is what the arrows indicate. This means that if you call .next() on the oxygen, it will return the alpha carbon instead of the hydrogen:

If you really wish to get the Hydrogen, you can do something like this:
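As a library-free sketch of the traversal idea: each atom keeps its full bond list plus one "arrow" to the next backbone atom. The names here (ToyAtom, backbone_next, the backbone flag, neighbors, walk) are hypothetical and only illustrate the concept; the real Atom class may track the backbone differently.

```python
class ToyAtom:
    """Hypothetical stand-in; NOT k1lib's Atom class."""

    def __init__(self, name):
        self.name = name
        self.bonds = []
        self.backbone_next = None  # the "arrow" to the next backbone atom

    def bond(self, other, backbone=False):
        self.bonds.append(other)
        other.bonds.append(self)
        if backbone:
            self.backbone_next = other
        return other

    def next(self):
        # Follows the arrow only; ignores every other neighbor.
        return self.backbone_next

    def neighbors(self, name):
        # All bonded atoms of a given element, backbone or not.
        return [atom for atom in self.bonds if atom.name == name]


def walk(atom):
    """Collect atoms by following backbone arrows until they run out."""
    seen = []
    while atom is not None and atom not in seen:
        seen.append(atom)
        atom = atom.next()
    return seen


# Toy ethanol backbone: O -> C (alpha) -> C, with one H on the oxygen
oxygen = ToyAtom("O")
hydrogen = oxygen.bond(ToyAtom("H"))
alpha = oxygen.bond(ToyAtom("C"), backbone=True)
alpha.bond(ToyAtom("C"), backbone=True)

print(oxygen.next().name)               # follows the arrow to the alpha carbon
print(oxygen.neighbors("H")[0].name)    # explicit lookup to reach the hydrogen
print([atom.name for atom in walk(oxygen)])
```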

Let's create isobutanol (2-methylpropan-1-ol):

This all looks fine. However, notice that the center carbon points to the CH2OH group instead. This might not be desirable, as you may want to navigate (using .next()) through the main propane chain. So you can do something slightly different:

Now, all the arrows are pointing correctly, so you can think of this molecule as "propane, with methanol attached at 2nd carbon", instead of "propanol, with methane attached at 2nd carbon".

There's also this really dope way to get molecules, and that is by just parsing it:

Experts among you might notice that "perfluoro-2,3-dimethyl-1-chloropropanol" doesn't exactly comply with IUPAC rules. The good news is that the parser is quite lenient about this.

My parser can handle lots of substances, but not all of them. If there's a list of molecules that you'd wish to "just work", you can send me an email at 157239q@gmail.com. But tbh, it shouldn't be that hard to construct any molecule that you want by hand.

Simulation

Now that you know how to construct molecules, let's talk about how you can simulate and view them. You need to first construct a System:

If you were to plot it right away, it looks terrible (graph in picometers btw):

This is because the atoms' positions are randomly initialized. So you need to do a short simulation first:

Let's plot it:

It looks wonderful now. The .simulate() method returns a list of locations:
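To give a feel for why "randomly initialized, then simulate a bit" works and why a list of locations falls out naturally, here's a deliberately crude stand-in: bonds are treated as springs and the atoms are nudged toward a rest length, with one snapshot saved per step. This is not the library's force field or integrator; the function name, parameters and arbitrary units are all my own assumptions.

```python
import math
import random


def simulate(positions, bonds, steps=200, rest_length=1.0, step_size=0.05):
    """Relax bonded pairs toward rest_length; return one snapshot per step.

    positions: list of (x, y, z) tuples, arbitrary units
    bonds:     list of (i, j) index pairs
    """
    positions = [list(p) for p in positions]  # work on a mutable copy
    history = []
    for _ in range(steps):
        forces = [[0.0, 0.0, 0.0] for _ in positions]
        for i, j in bonds:
            delta = [positions[j][k] - positions[i][k] for k in range(3)]
            dist = math.sqrt(sum(d * d for d in delta)) or 1e-9
            pull = dist - rest_length  # > 0 if stretched, < 0 if compressed
            for k in range(3):
                f = pull * delta[k] / dist
                forces[i][k] += f  # i moves toward j when stretched
                forces[j][k] -= f  # j moves toward i when stretched
        for p, f in zip(positions, forces):
            for k in range(3):
                p[k] += step_size * f[k]
        history.append([tuple(p) for p in positions])
    return history


random.seed(0)
start = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(3)]
frames = simulate(start, bonds=[(0, 1), (1, 2)])
print(len(frames), "frames,", len(frames[0]), "atoms per frame")
```

Each entry in frames is a full set of coordinates at one timestep, so feeding them to a plotting loop one by one is all an animation needs.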

This is quite useful if you want to see an animation of it:

Notice how you can see cyclohexane in the chair conformation (70% of the time) or the boat conformation (30% of the time). Every time I run this it's gonna be different, but it's always one of those two. This sort of indicates that the simulator at least got some stuff right, and that you can rely on it to explore small molecules.

Simulation speed analysis

How big of a molecule can the simulator handle? Let's try adenosine, a relatively complex molecule:

So a 33-atom molecule at 1000 timesteps takes about 250ms. The cool thing is that the simulator is based on PyTorch, so if you have a GPU, it can use that too:

Roughly the same speed. For a small molecule like adenosine, the performance gain isn't worth the overhead of using the GPU. Let's try attaching a bunch of adenosines together:

324-atom molecule, pretty big. This is about 15 amino acid residues btw.

So we increased our atom count by 10x and CPU times also increased 10x, but amazingly, GPU times don't increase at all. So a general rule of thumb is: use the CPU if your molecule has fewer than 30-40 atoms, and use the GPU otherwise. Realistically though, the simulator is there for small molecules only. It's quite worthless for proteins.
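If you want to redo this comparison on your own machine, a small timing helper plus the rule of thumb written as a function is enough. Both time_call and pick_device are hypothetical helpers of mine, not part of the library, and the threshold of 40 is just the upper end of the 30-40 range above. In practice the gpu_available flag would come from something like torch.cuda.is_available().

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Return the best wall-clock time (seconds) over a few runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def pick_device(n_atoms, gpu_available, threshold=40):
    """Rule of thumb: small molecules on CPU, bigger ones on GPU if present."""
    if gpu_available and n_atoms >= threshold:
        return "cuda"
    return "cpu"


# Stand-in workload; swap in your own simulation call here.
elapsed = time_call(sum, range(100_000))
print(f"{elapsed * 1000:.3f} ms")

print(pick_device(33, gpu_available=True))    # adenosine-sized
print(pick_device(324, gpu_available=True))   # the 324-atom chain
```

Taking the best of a few runs rather than a single one helps keep one-off hiccups (like GPU warm-up on the first call) from skewing the comparison.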