k1lib.cli module¶
The main idea of this package is to emulate the terminal (hence “cli”, or “command line interface”), but doing all of that inside Python itself. So this bash statement:
cat file.txt | head -5 > headerFile.txt
Turns into this statement:
cat("file.txt") | head(5) > file("headerFile.txt")
Here, “cat”, “head” and “file” are all classes extended
from BaseCli. All of
them implement the “reverse or” operation, or __ror__.
Essentially, these 2 statements are equivalent:
3 | obj
obj.__ror__(3)
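To make the mechanism concrete, here is a minimal sketch of a class that takes part in a pipe through __ror__. The Double class is hypothetical and not part of k1lib; it only illustrates the fallback Python performs when the left operand cannot handle `|` itself.

```python
class Double:
    """Hypothetical cli-like tool: doubles every element piped into it."""

    def __ror__(self, it):
        # Python calls this for `it | Double()` because a list has no `|` of its own
        return [x * 2 for x in it]


result = [1, 2, 3] | Double()
print(result)  # [2, 4, 6]
```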
Also, a lot of these tools assume that we are operating on a table. So this table:
| col1 | col2 | col3 | 
|---|---|---|
| 1 | 2 | 3 | 
| 4 | 5 | 6 | 
Is equivalent to this list:
[["col1", "col2", "col3"], [1, 2, 3], [4, 5, 6]]
Also, the expected way to use these tools is to import everything directly into the current environment, like this:
from k1lib.imports import *
Because there are a lot of clis, you may sometimes unintentionally overwrite an
exposed cli tool. No worries, every tool is also under the cli object, meaning
you can use deref() or cli.deref().
Besides operating on string iterators alone, this package can also be extra meta, and operate on streams of strings, or streams of streams of anything. I think this is one of the most powerful concepts of the cli workflow. If this interests you, check over this:
Core clis include apply, applyS (its
multiprocessing cousins applyMp and applyMpBatched
are great too), op, filt and deref,
so start reading there first. Then, skim over everything to know what you can do with
this collection of tools. While you’re doing that, check out trace(),
a quite powerful debugging tool.
bio module¶
This is for functions that are actually biology-related
- 
class k1lib.cli.bio.transcribe(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli- Transcribes (DNA -> RNA) incoming rows 
- 
class k1lib.cli.bio.complement(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
- 
class k1lib.cli.bio.translate(length: int = 0)[source]¶
- Bases: - k1lib.cli.init.BaseCli
- 
class k1lib.cli.bio.medAa(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli- Converts short aa sequence to medium one 
entrez module¶
This module is not really fleshed out, not that useful/elegant, and I just use
cmd instead
mgi module¶
All tools related to the MGI database. Expected to be used behind the “mgi” module name, like this:
from k1lib.imports import *
["SOD1", "AMPK"] | mgi.batch()
filt module¶
This is for functions that cut out specific parts of the table
- 
class k1lib.cli.filt.filt(predicate: Callable[[T], bool], column: Optional[int] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(predicate: Callable[[T], bool], column: Optional[int] = None)[source]¶
- Filters lines, keeping only those for which the predicate is true. Examples:
# returns [2, 6]
[2, 3, 5, 6] | filt(lambda x: x%2 == 0) | deref()
# returns [3, 5]
[2, 3, 5, 6] | ~filt(lambda x: x%2 == 0) | deref()
# returns [[2, 'a'], [6, 'c']]
[[2, "a"], [3, "b"], [5, "a"], [6, "c"]] | filt(lambda x: x%2 == 0, 0) | deref()
You can also pass in op, for extra intuitiveness:
# returns [2, 6]
[2, 3, 5, 6] | filt(op() % 2 == 0) | deref()
# returns ['abc', 'a12']
["abc", "def", "a12"] | filt(op().startswith("a")) | deref()
Parameters
- column –
- if integer, then predicate(row[column])
- if None, then predicate(row)
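The column and inversion behaviour described above can be sketched in plain Python. The Filt class below is a hypothetical stand-in written for illustration, not the library's implementation; it assumes inversion simply flips which rows are kept.

```python
class Filt:
    """Hypothetical stand-in for filt: keeps rows whose predicate is true."""

    def __init__(self, predicate, column=None, invert=False):
        self.predicate, self.column, self.invert = predicate, column, invert

    def __invert__(self):
        # `~Filt(...)` keeps exactly the rows the original would drop
        return Filt(self.predicate, self.column, not self.invert)

    def __ror__(self, rows):
        for row in rows:
            # Test the whole row, or only one column of it
            value = row if self.column is None else row[self.column]
            if bool(self.predicate(value)) != self.invert:
                yield row


def is_even(x):
    return x % 2 == 0


evens = list([2, 3, 5, 6] | Filt(is_even))
odds = list([2, 3, 5, 6] | ~Filt(is_even))
by_column = list([[2, "a"], [3, "b"], [6, "c"]] | Filt(is_even, 0))
```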
 
 
 
- 
- 
k1lib.cli.filt.isFile() → k1lib.cli.filt.filt[source]¶
- Filters out non-files. Example:
# returns ["a.py", "b.py"], if those files really do exist
["a.py", "hg/", "b.py"] | isFile()
- 
k1lib.cli.filt.inSet(values: Set[Any], column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶
- Filters out lines that are not in the specified set. Example:
# returns [2, 3]
range(5) | inSet([2, 8, 3]) | deref()
# returns [0, 1, 4]
range(5) | ~inSet([2, 8, 3]) | deref()
- 
k1lib.cli.filt.contains(s: str, column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶
- Filters out lines that don’t contain the specified substring. Sort of similar to grep, but this is simpler, and can be inverted. Example:
# returns ['abcd', '2bcr']
["abcd", "0123", "2bcr"] | contains("bc") | deref()
- 
class k1lib.cli.filt.empty(reverse=False)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(reverse=False)[source]¶
- Filters out streams that are not empty. Almost always used inverted, but “empty” is a short, sweet name that is easy to remember. Example:
# returns [[1, 2], ['a']]
[[], [1, 2], [], ["a"]] | ~empty() | deref()
Parameters
- reverse – not intended to be used by the end user. Do ~empty() instead.
 
 
- 
- 
k1lib.cli.filt.isNumeric(column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶
- Filters out a line if that column is not a number. Example:
# returns [0, 2, '3']
[0, 2, "3", "a"] | isNumeric() | deref()
- 
k1lib.cli.filt.instanceOf(cls: Union[type, Tuple[type]], column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶
- Filters out lines that are not an instance of the given type. Example:
# returns [2]
[2, 2.3, "a"] | instanceOf(int) | deref()
# returns [2, 2.3]
[2, 2.3, "a"] | instanceOf((int, float)) | deref()
- 
k1lib.cli.filt.inRange(min: float = - inf, max: float = inf, column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶
- Checks whether a column is in range or not. Example:
# returns [-2, 3, 6]
[-2, -8, 3, 6] | inRange(min=-3) | deref()
# returns [-8]
[-2, -8, 3, 6] | ~inRange(min=-3) | deref()
- 
class k1lib.cli.filt.head(n: int = 10)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(n: int = 10)[source]¶
- Only outputs the first n lines. You can also negate it (like ~head(5)), which then only outputs the lines after the first n. A negative n counts from the end. Examples:
"abcde" | head(2) | deref() # returns ["a", "b"]
"abcde" | ~head(2) | deref() # returns ["c", "d", "e"]
"0123456" | head(-3) | deref() # returns ['0', '1', '2', '3']
"0123456" | ~head(-3) | deref() # returns ['4', '5', '6']
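The four cases above line up with ordinary Python slicing, which the following sketch uses to mirror them. The head_sketch function is hypothetical and loads everything into a list, so it is not lazy; the real cli presumably works on streams.

```python
def head_sketch(items, n=10, invert=False):
    """Mirror head(n) / ~head(n) semantics on an in-memory sequence."""
    items = list(items)
    # items[:n] is the "head"; items[n:] is everything head(n) leaves out
    return items[n:] if invert else items[:n]


print(head_sketch("abcde", 2))                  # ['a', 'b']
print(head_sketch("0123456", -3, invert=True))  # ['4', '5', '6']
```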
 
- 
- 
class k1lib.cli.filt.columns(*columns: List[int])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*columns: List[int])[source]¶
- Cuts out specific columns, sliceable. Examples:
["0123456789"] | cut(5, 8) | deref() # returns [['5', '8']]
["0123456789"] | cut(2) | deref() # returns ['2']
["0123456789"] | ~cut()[:7:2] | deref() # returns [['1', '3', '5', '7', '8', '9']]
If you’re selecting only 1 column, then Iterator[T] will be returned, not Table[T].
 
- 
- 
k1lib.cli.filt.cut¶
- alias of - k1lib.cli.filt.columns
- 
class k1lib.cli.filt.rows(*rows: List[int])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*rows: List[int])[source]¶
- Cuts out specific rows. Space complexity O(1) as a list is not constructed (unless you’re using some really weird slices).
Parameters
- rows – ints for the row indices
Example:
"0123456789" | rows(2) | deref() # returns ["2"]
"0123456789" | rows(5, 8) | deref() # returns ["5", "8"]
"0123456789" | rows()[2:5] | deref() # returns ["2", "3", "4"]
"0123456789" | ~rows()[2:5] | deref() # returns ["0", "1", "5", "6", "7", "8", "9"]
"0123456789" | ~rows()[:7:2] | deref() # returns ['1', '3', '5', '7', '8', '9']
"0123456789" | rows()[:-4] | deref() # returns ['0', '1', '2', '3', '4', '5']
"0123456789" | ~rows()[:-4] | deref() # returns ['6', '7', '8', '9']
 
- 
- 
class k1lib.cli.filt.intersection(fs=[])[source]¶
- Bases: k1lib.cli.init.BaseCli
Returns the intersection of multiple streams. Example:
# returns set([2, 4, 5])
[[1, 2, 3, 4, 5], [7, 2, 4, 6, 5]] | intersection()
- 
class k1lib.cli.filt.union(fs=[])[source]¶
- Bases: k1lib.cli.init.BaseCli
Returns the union of multiple streams. Example:
# returns {0, 1, 2, 10, 11, 12, 13, 14}
[range(3), range(10, 15)] | union()
- 
class k1lib.cli.filt.unique(column: int)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(column: int)[source]¶
- Filters out rows whose value in the given column has already been seen. Example:
# returns [[1, "a"], [2, "a"]]
[[1, "a"], [2, "a"], [1, "b"]] | unique(0) | deref()
Parameters
- column – doesn’t have the default case of None, because you can always use k1lib.cli.utils.toSet
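A minimal sketch of column-based de-duplication follows. The unique_sketch function is hypothetical; it assumes the first row seen for each key is the one kept, which is consistent with the example above but not confirmed against the library source.

```python
def unique_sketch(rows, column):
    """Yield only the first row seen for each distinct value in `column`."""
    seen = set()
    for row in rows:
        key = row[column]
        if key not in seen:
            seen.add(key)
            yield row


table = [[1, "a"], [2, "a"], [1, "b"]]
print(list(unique_sketch(table, 0)))  # [[1, 'a'], [2, 'a']]
```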
 
 
- 
- 
class k1lib.cli.filt.breakIf(f)[source]¶
- Bases: - k1lib.cli.init.BaseCli
gb module¶
All tools related to the GenBank file format. Expected to be used behind the “gb” module name, like this:
from k1lib.imports import *
cat("abc.gb") | gb.feats()
- 
class k1lib.cli.gb.feats(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli- Fetches features, each on a separate stream - 
static filt(*terms: str) → k1lib.cli.init.BaseCli[source]¶
- Filters for specific terms in all the features texts. If there are multiple terms, then filters for first term, then second, then third, so the term’s order might matter to you 
 - 
static tag(tag: str) → k1lib.cli.init.BaseCli[source]¶
- Gets a single tag out. Applies this on a single feature only 
 
- 
static 
- 
class k1lib.cli.gb.origin(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli- Return the origin section of the genbank file 
grep module¶
- 
class k1lib.cli.grep.grep(pattern: str, before: int = 0, after: int = 0)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(pattern: str, before: int = 0, after: int = 0)[source]¶
- Finds lines that have the specified pattern. Example:
# returns ['c', 'd', '2', 'd']
"abcde12d34" | grep("d", 1) | deref()
# returns ['d', 'e', 'd', '3', '4']
"abcde12d34" | grep("d", 0, 3).till("e") | deref()
Parameters
- pattern – regex pattern to search for in a line 
- before – lines before the hit. Outputs independent lines 
- after – lines after the hit. Outputs independent lines 
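The before and after parameters can be illustrated with a small generator. This grep_sketch function is a hypothetical stand-in built on the standard re module; it does not cover the .till() feature shown in the example above, and how it treats overlapping context is an assumption.

```python
import re
from collections import deque


def grep_sketch(lines, pattern, before=0, after=0):
    """Yield matching lines plus `before`/`after` lines of context."""
    regex = re.compile(pattern)
    history = deque(maxlen=before)  # most recent non-emitted lines
    remaining = 0                   # context lines still owed after a hit
    for line in lines:
        if regex.search(line):
            yield from history
            history.clear()
            yield line
            remaining = after
        elif remaining > 0:
            yield line
            remaining -= 1
        else:
            history.append(line)


print(list(grep_sketch("abcde12d34", "d", before=1)))  # ['c', 'd', '2', 'd']
```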
 
 
 
- 
- 
class k1lib.cli.grep.grepToTable(pattern: str, before: int = 0, after: int = 0)[source]¶
- Bases: - k1lib.cli.init.BaseCli
init module¶
- 
cli.cliSettings= {'context': <function settingsContext>, 'defaultDelim': '\t', 'defaultIndent': ' ', 'inf': inf, 'lookupImgs': True, 'oboFile': None, 'strict': False, 'svgScale': 0.7}¶
- Main settings of k1lib.cli. When using:
from k1lib.cli import *
…you can just set the settings like this:
cliSettings["defaultIndent"] = "\t"
There are a few settings:
defaultDelim: default delimiter used in-between columns when creating tables
defaultIndent: default indent used for displaying nested structures
lookupImgs: whether to automatically look up images when exploring something
oboFile: gene ontology obo file location
strict: whether strict mode is on. Turning it on can help you debug stuff, but could also be a pain to work with
svgScale: default svg scale for clis that display graphviz graphs
inf: infinity definition for many clis. Defaulted to just
float("inf"). Here because you might want to temporarily not loop things infinitely.
context: context manager to preserve old settings value. Example:
with cliSettings["context"]():
    cliSettings["inf"] = 21
# old settings automatically restored
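For readers curious how such a restore-on-exit context can work, here is a sketch using contextlib. The settings dict and settings_context function are hypothetical stand-ins for cliSettings and its "context" entry; the library may implement this differently.

```python
from contextlib import contextmanager

# Hypothetical stand-in for cliSettings
settings = {"inf": float("inf"), "strict": False}


@contextmanager
def settings_context():
    saved = dict(settings)  # shallow snapshot of the current values
    try:
        yield settings
    finally:
        # Restore the snapshot, even if the body raised an exception
        settings.clear()
        settings.update(saved)


with settings_context():
    settings["inf"] = 21
    inside = settings["inf"]

print(inside, settings["inf"])  # 21 inf
```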
- 
class k1lib.cli.init.BaseCli(fs=[])[source]¶
- Bases: object
A base class for all the cli stuff. You can definitely create new cli tools that have the same feel without extending from this class, but advanced stream operations (like +, &, .all(), |) won’t work.
At the moment, you don’t have to call super().__init__() and super().__ror__(), as __init__’s only job right now is to solidify any op passed to it, and __ror__ does nothing.
__init__(fs=[])[source]¶
- Not expected to be instantiated by the end user. - Parameters
- fs – if any function in here is actually an op, then solidifies it (makes it not absorb __call__ anymore)
 
 - 
__and__(cli: k1lib.cli.init.BaseCli) → k1lib.cli.init.oneToMany[source]¶
- Duplicates input stream to multiple joined clis. Example:
# returns [[5], [0, 1, 2, 3, 4]]
range(5) | (shape() & iden()) | deref()
Kinda like apply. There’re just multiple ways of doing this. This, I think, is more intuitive, and apply is more for lambdas and columns mode. Performance is pretty much identical.
 - 
__add__(cli: k1lib.cli.init.BaseCli) → k1lib.cli.init.manyToManySpecific[source]¶
- Parallel pass multiple streams to multiple clis. 
 - 
all(n: int = 1) → k1lib.cli.init.BaseCli[source]¶
- Applies this cli to all incoming streams. - Parameters
- n – how many times to chain .all()
 
 - 
__or__(cli) → k1lib.cli.init.serial[source]¶
- Joins clis end-to-end 
 
- 
- 
class k1lib.cli.init.serial(*clis: List[k1lib.cli.init.BaseCli])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*clis: List[k1lib.cli.init.BaseCli])[source]¶
- Merges clis into 1, feeding end to end. Used in chaining clis together without a prime iterator. Meaning, without this, stuff like this fails to run:
[1, 2] | a() | b() # runs
c = a() | b(); [1, 2] | c # doesn't run if this class doesn't exist
 
- 
- 
class k1lib.cli.init.oneToMany(*clis: List[k1lib.cli.init.BaseCli])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*clis: List[k1lib.cli.init.BaseCli])[source]¶
- Duplicates 1 stream into multiple streams, each for a cli in the list. Used in the “a & b” joining operator. See also: BaseCli.__and__()
 
- 
- 
class k1lib.cli.init.manyToMany(cli: k1lib.cli.init.BaseCli)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(cli: k1lib.cli.init.BaseCli)[source]¶
- Applies multiple streams to a single cli. Used in the BaseCli.all() operator.
 
- 
- 
class k1lib.cli.init.manyToManySpecific(*clis: List[k1lib.cli.init.BaseCli])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*clis: List[k1lib.cli.init.BaseCli])[source]¶
- Applies multiple streams to multiple clis independently. Used in the “a + b” joining operator. See also: BaseCli.__add__()
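The four ways of joining clis (serial, oneToMany, manyToMany and manyToManySpecific) can be compared side by side with plain functions. The helpers below are hypothetical sketches that work on ordinary callables rather than BaseCli objects; they show the data flow only, under the assumption that each cli behaves like a function of its input stream.

```python
from itertools import tee


def serial(*fs):
    """a | b: feed the output of each function into the next."""
    def run(stream):
        for f in fs:
            stream = f(stream)
        return stream
    return run


def one_to_many(*fs):
    """a & b: give every function its own copy of the same stream."""
    def run(stream):
        copies = tee(stream, len(fs))
        return [f(copy) for f, copy in zip(fs, copies)]
    return run


def many_to_many(f):
    """a.all(): apply one function to each incoming stream."""
    return lambda streams: [f(s) for s in streams]


def many_to_many_specific(*fs):
    """a + b: pair the i-th stream with the i-th function."""
    return lambda streams: [f(s) for f, s in zip(fs, streams)]


def count(stream):
    return sum(1 for _ in stream)


print(one_to_many(count, list)(range(5)))  # [5, [0, 1, 2, 3, 4]]
```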
 
- 
inp module¶
This module is for tools that will likely start the processing stream.
- 
k1lib.cli.inp.cat(fileName: Optional[str] = None, text: bool = True)[source]¶
- Reads a file line by line. Example:
# display first 10 lines of file
cat("file.txt") | headOut()
# piping in also works
"file.txt" | cat() | headOut()
# rename file
cat("img.png", False) | file("img2.png", False)
Parameters
- fileName – if None, then return a BaseCli that accepts a file name and outputs Iterator[str]
- text – if True, read text file, else read binary file 
 
 
- 
k1lib.cli.inp.curl(url: str) → Iterator[str][source]¶
- Gets file from url. File can’t be a binary blob. Example:
# prints out first 10 lines of the website
curl("https://k1lib.github.io/") | headOut()
- 
k1lib.cli.inp.wget(url: str, fileName: Optional[str] = None)[source]¶
- Downloads a file - Parameters
- url – The url of the file 
- fileName – if None, then tries to infer it from the url 
 
 
- 
k1lib.cli.inp.ls(folder: Optional[str] = None)[source]¶
- Lists every file and folder inside the specified folder. Example:
# returns List[str]
ls("/home")
# same as above
"/home" | ls()
# only outputs files, not folders
ls("/home") | isFile()
See also: isFile()
- 
class k1lib.cli.inp.cmd(cmd: str)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(cmd: str)[source]¶
- Runs a command, and returns the output line by line. Example:
# return detailed list of files
None | cmd("ls -la")
# return list of files that end with "ipynb"
None | cmd("ls -la") | cmd('grep ipynb$')
 - 
property err¶
- Error from the last command 
 
- 
- 
k1lib.cli.inp.requireCli(cliTool: str)[source]¶
- Searches for a particular cli tool (e.g. “ls”). Throws ImportError if not found, else does nothing
- 
class k1lib.cli.inp.toPIL[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__()[source]¶
- Converts a path to a PIL image. Example:
ls(".") | toPIL().all() | item() # get first image
 - 
__ror__(path) → PIL.Image.Image[source]¶
 
- 
kcsv module¶
All tools related to the csv file format. Expected to be used behind the “kcsv” module name, like this:
from k1lib.imports import *
kcsv.cat("file.csv") | display()
kxml module¶
All tools related to the xml file format. Expected to be used behind the “kxml” module name, like this:
from k1lib.imports import *
cat("abc.xml") | kxml.node() | kxml.display()
- 
class k1lib.cli.kxml.node(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli- Turns lines into a single node - 
__ror__(it: Iterator[str]) → Iterator[xml.etree.ElementTree.Element][source]¶
 
- 
- 
class k1lib.cli.kxml.maxDepth(depth: Optional[int] = None, copy: bool = True)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(depth: Optional[int] = None, copy: bool = True)[source]¶
- Filters out too deep nodes - Parameters
- depth – max depth to include in 
- copy – whether to limit the nodes themselves, or limit a copy
 
 
 - 
__ror__(nodes: Iterator[xml.etree.ElementTree.Element]) → Iterator[xml.etree.ElementTree.Element][source]¶
 
- 
- 
class k1lib.cli.kxml.tag(tag: str)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(tag: str)[source]¶
- Finds all tags that have a particular name. If found, then don’t search deeper 
 - 
__ror__(nodes: Iterator[xml.etree.ElementTree.Element]) → Iterator[xml.etree.ElementTree.Element][source]¶
 
- 
- 
class k1lib.cli.kxml.pretty(indent: Optional[str] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__ror__(it: Iterator[xml.etree.ElementTree.Element]) → Iterator[str][source]¶
 
- 
modifier module¶
This is for quick modifiers, think of them as changing formats
- 
class k1lib.cli.modifier.apply(f: Callable[[str], str], column: Optional[int] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f: Callable[[str], str], column: Optional[int] = None)[source]¶
- Applies a function f to every line. Example:
# returns [0, 1, 4, 9, 16]
range(5) | apply(lambda x: x**2) | deref()
# returns [[3.0, 1.0, 1.0], [3.0, 1.0, 1.0]]
torch.ones(2, 3) | apply(lambda x: x+2, 0) | deref()
You can also use this as a decorator, like this:
@apply
def f(x):
    return x**2
# returns [0, 1, 4, 9, 16]
range(5) | f | deref()
Parameters
- column – if not None, then applies the function to that column only 
 
 
- 
- 
class k1lib.cli.modifier.applyMp(f: Callable[[T], T], prefetch: Optional[int] = None, timeout: float = 2, **kwargs)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f: Callable[[T], T], prefetch: Optional[int] = None, timeout: float = 2, **kwargs)[source]¶
- Like apply, but executes f(row) for each row in multiple processes. Example:
# returns [3, 2]
["abc", "de"] | applyMp(lambda s: len(s)) | deref()
# returns [5, 6, 9]
range(3) | applyMp(lambda x, bias: x**2+bias, bias=5) | deref()
# returns [[1, 2, 3], [1, 2, 3]], demonstrating outside vars work
someList = [1, 2, 3]
["abc", "de"] | applyMp(lambda s: someList) | deref()
Internally, this will continuously spawn new jobs until 80% of all CPU cores are utilized. On posix systems, the default multiprocessing start method is fork(). This sort of means that all the variables in memory will be copied over. This might be expensive (or might not, with copy-on-write), so you might have to think about that. On windows and macos, the default start method is spawn, meaning each child process is a completely new interpreter, so you have to pass in all required variables and reimport every dependency. Read more at https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
If you don’t wish to schedule all jobs at once, you can specify a prefetch amount, and it will only schedule that many jobs ahead of time. Example:
range(10000) | applyMp(lambda x: x**2) | head() | deref() # 700ms
range(10000) | applyMp(lambda x: x**2, 5) | head() | deref() # 300ms
# demonstrating there're no huge penalties even if we want all results at the same time
range(10000) | applyMp(lambda x: x**2) | deref() # 900ms
range(10000) | applyMp(lambda x: x**2, 5) | deref() # 1000ms
The first line will schedule all jobs at once, and thus will require more RAM and compute power, even though we discard most of the results anyway (the head cli). The second line only schedules 5 jobs ahead of time, and thus will be much more efficient if you don’t need all results right away.
Note
Remember that every BaseCli is also a function, meaning that you can do stuff like:
# returns [['ab', 'ac']]
[["ab", "cd", "ac"]] | applyMp(filt(op().startswith("a")) | deref()) | deref()
Also remember that the return result of f should not be a generator. That’s why in the example above, there’s a deref() inside f.
Most of the time, you’d probably want to use applyMpBatched instead. That cli tool has the same look and feel as this one, but executes f multiple times in a single job, instead of only 1 time per job as here, so it should dramatically improve performance for most workloads.
One last thing. Remember to close all pools (using clearPools()) before exiting the script so that all child processes are terminated and resources are freed. For example, if you use CUDA tensors but have not closed all pools yet, then it is possible that CUDA memory is not freed. I learned this the hard way. I’ve tried to use atexit to close pools automatically, but it doesn’t seem to work with notebooks.
Parameters
- prefetch – if not specified, schedules all jobs at the same time. If specified, schedules jobs so that there’ll only be a specified amount of jobs, and will only schedule more if results are actually being used. 
- timeout – seconds to wait for job before raising an error 
- kwargs – extra arguments to be passed to the function. args is not included, as there’re a couple of options you can pass for this cli.
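The prefetch idea, keeping only a bounded number of jobs in flight and scheduling more as results are consumed, can be sketched as follows. This is not the library's implementation: the apply_prefetch function is hypothetical, and it swaps in a thread pool (concurrent.futures.ThreadPoolExecutor) for the process pool so that lambdas work without pickling.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def apply_prefetch(f, items, prefetch=5, max_workers=4):
    """Lazily yield f(x) in order, keeping at most `prefetch` jobs in flight."""
    it = iter(items)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Schedule the first `prefetch` jobs up front
        pending = deque(pool.submit(f, x) for x in islice(it, prefetch))
        while pending:
            result = pending.popleft().result()
            # Top the queue back up with one more job, if any input is left
            for x in islice(it, 1):
                pending.append(pool.submit(f, x))
            yield result


seen = []  # records which inputs were actually pulled from the source


def source():
    for i in range(10000):
        seen.append(i)
        yield i


gen = apply_prefetch(lambda x: x ** 2, source(), prefetch=5)
first_ten = list(islice(gen, 10))
gen.close()  # shuts the pool down instead of waiting for garbage collection
print(first_ten, len(seen))
```

Only a small prefix of the 10000 inputs is ever pulled from the source here, which is the behaviour the prefetch argument is meant to give.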
 
 
 
- 
- 
k1lib.cli.modifier.applyMpBatched(f, bs=32, prefetch=2, timeout=5, **kwargs)[source]¶
- Pretty much the same as applyMp, and has the same feel to it too. Iterator[A] goes in, Iterator[B] goes out, and you specify f(A) -> B. However, this will launch jobs that each execute f() multiple times, instead of 1 job per execution. All examples from applyMp should work perfectly here.
- 
class k1lib.cli.modifier.applyS(f: Callable[[T], T])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f: Callable[[T], T])[source]¶
- Like apply, but much simpler, just operating on the entire input object, essentially. The “S” stands for “single”. Example:
# returns 5
3 | applyS(lambda x: x+2)
Like apply, you can also use this as a decorator like this:
@applyS
def f(x):
    return x+2
# returns 5
3 | f
This also decorates the returned object so that it has the same qualname, docstring and whatnot.
 
- 
- 
class k1lib.cli.modifier.applyCached(f, limit: int = 1000)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f, limit: int = 1000)[source]¶
- Like apply, but caches the results, so subsequent requests are faster. All examples from apply should work. Example:
# returns [0, 1, 4, 9, 16, 0, 1, 4, 9, 16]
[*range(5), *range(5)] | applyCached(lambda x: x**2) | cli.deref()
I’m thinking about just adding a cacheLimit argument to apply, and having it integrate with everything. However, this feature doesn’t seem useful enough yet. Maybe in a future version.
Parameters
- limit – max cache size 
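A bounded cache of this kind can be approximated with functools.lru_cache from the standard library. The apply_cached function below is a hypothetical sketch, not how applyCached is implemented; it assumes the inputs are hashable and that the least recently used entries are the ones evicted.

```python
from functools import lru_cache


def apply_cached(f, limit=1000):
    """Return a function mapping f over items, memoizing up to `limit` results."""
    cached = lru_cache(maxsize=limit)(f)
    return lambda items: [cached(x) for x in items]


calls = []  # records every real invocation of square


def square(x):
    calls.append(x)
    return x ** 2


squares = apply_cached(square)([*range(5), *range(5)])
print(squares)  # [0, 1, 4, 9, 16, 0, 1, 4, 9, 16]
print(calls)    # [0, 1, 2, 3, 4], the second pass is served from the cache
```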
 
 
- 
- 
class k1lib.cli.modifier.applySerial(f, includeFirst=False)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f, includeFirst=False)[source]¶
- Applies a function repeatedly to the input x. Yields f(x), then f(f(x)), then f(f(f(x))) and so on. If includeFirst is True, the raw input x is yielded first. Example:
# returns [4, 8, 16, 32, 64]
2 | applySerial(op()*2) | head(5) | deref()
If the result of your operation is an iterator, you might want to deref it, like this:
rs = iter(range(8)) | applySerial(rows()[::2])
# returns [0, 2, 4, 6]
next(rs) | deref()
# returns []. This is because all the elements are taken by the previous deref()
next(rs) | deref()
rs = iter(range(8)) | applySerial(rows()[::2] | deref())
# returns [0, 2, 4, 6]
next(rs)
# returns [0, 4]
next(rs)
# returns [0]
next(rs)
Parameters
- f – function to apply repeatedly 
- includeFirst – whether to include the raw input value or not 
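The repeated application can be written as a short infinite generator. The apply_serial function is a hypothetical sketch of the behaviour described above, with include_first standing in for the includeFirst parameter.

```python
from itertools import islice


def apply_serial(f, x, include_first=False):
    """Yield f(x), f(f(x)), ... forever; optionally yield x itself first."""
    if include_first:
        yield x
    while True:
        x = f(x)
        yield x


print(list(islice(apply_serial(lambda v: v * 2, 2), 5)))  # [4, 8, 16, 32, 64]
```

Because the generator never ends on its own, it needs a limiting step such as islice here, or head in the cli version.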
 
 
 
- 
- 
k1lib.cli.modifier.replace(s: str, target: Optional[str] = None, column: Optional[int] = None)[source]¶
- Replaces substring s with target for each line. Example:
# returns ['104', 'ab0c']
["1234", "ab23c"] | replace("23", "0") | deref()
Parameters
- target – if not specified, then use the default delimiter specified in cliSettings
 
- 
k1lib.cli.modifier.remove(s: str, column: Optional[int] = None)[source]¶
- Removes a specific substring in each line. 
- 
k1lib.cli.modifier.toFloat(*columns: List[int], force=False)[source]¶
- Converts every row into a float. Example:
# returns [1, 3, -2.3]
["1", "3", "-2.3"] | toFloat() | deref()
# returns [[1.0, 'a'], [2.3, 'b'], [8.0, 'c']]
[["1", "a"], ["2.3", "b"], [8, "c"]] | toFloat(0) | deref()
With weird rows:
# returns [[1.0, 'a'], [8.0, 'c']]
[["1", "a"], ["c", "b"], [8, "c"]] | toFloat(0) | deref()
# returns [[1.0, 'a'], [0.0, 'b'], [8.0, 'c']]
[["1", "a"], ["c", "b"], [8, "c"]] | toFloat(0, force=True) | deref()
Parameters
- columns – if nothing, then will convert each row. If available, then convert all the specified columns 
- force – if True, forces weird values to 0.0, else filters out all weird rows 
 
 
- 
k1lib.cli.modifier.toInt(*columns: List[int], force=False)[source]¶
- Converts every row into an integer. Example:
# returns [1, 3, -2]
["1", "3", "-2.3"] | toInt() | deref()
Parameters
- columns – if nothing, then will convert each row. If available, then convert all the specified columns 
- force – if True, forces weird values to 0, else filters out all weird rows 
 
See also: toFloat()
- 
class k1lib.cli.modifier.sort(column: int = 0, numeric=True, reverse=False)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(column: int = 0, numeric=True, reverse=False)[source]¶
- Sorts all lines based on a specific column. Example:
# returns [[5, 'a'], [1, 'b']]
[[1, "b"], [5, "a"]] | ~sort(0) | deref()
# returns [[2, 3]]
[[1, "b"], [5, "a"], [2, 3]] | ~sort(1) | deref()
# errors out, as you can't really compare str with int
[[1, "b"], [2, 3], [5, "a"]] | sort(1, False) | deref()
Parameters
- column – if None, sort rows based on themselves and not an element 
- numeric – whether to convert column to float 
- reverse – False for smaller to bigger, True for bigger to smaller. Use __invert__() to quickly reverse the order instead of using this param
 
 
 
- 
- 
class k1lib.cli.modifier.sortF(f: Callable[[T], float], reverse=False)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f: Callable[[T], float], reverse=False)[source]¶
- Sorts rows using a function. Example:
# returns ['a', 'aa', 'aaa', 'aaaa', 'aaaaa']
["a", "aaa", "aaaaa", "aa", "aaaa"] | sortF(lambda r: len(r)) | deref()
# returns ['aaaaa', 'aaaa', 'aaa', 'aa', 'a']
["a", "aaa", "aaaaa", "aa", "aaaa"] | ~sortF(lambda r: len(r)) | deref()
 - 
__invert__() → k1lib.cli.modifier.sortF[source]¶
 
- 
- 
class k1lib.cli.modifier.consume(f: Union[k1lib.cli.init.BaseCli, Callable[[T], None]])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f: Union[k1lib.cli.init.BaseCli, Callable[[T], None]])[source]¶
- Consumes the iterator in a side stream. Returns the iterator. Kinda like the bash command tee. Example:
# prints "0\n1\n2" and returns [0, 1, 2]
range(3) | consume(headOut()) | toList()
# prints "range(0, 3)" and returns [0, 1, 2]
range(3) | consume(lambda it: print(it)) | toList()
This is useful whenever you want to mutate something, but don’t want to include the function result in the main stream.
 
- 
- 
class k1lib.cli.modifier.randomize(bs=100)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(bs=100)[source]¶
- Randomizes the input stream. In order to be efficient, this does not convert the input iterator to a giant list and yield random values from that. Instead, this fetches bs items at a time, randomizes them, returns them, and then fetches another bs items. If you want to do the giant list, then just pass in float("inf"), or None. Example:
# returns [0, 1, 2, 3, 4], effectively no randomize at all
range(5) | randomize(1) | deref()
# returns something like this: [1, 0, 2, 3, 5, 4, 6, 8, 7, 9]. You can clearly see the batches
range(10) | randomize(3) | deref()
# returns something like this: [7, 0, 5, 2, 4, 9, 6, 3, 1, 8]
range(10) | randomize(float("inf")) | deref()
# same as above
range(10) | randomize(None) | deref()
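The batch-wise shuffling described above can be sketched with the standard random and itertools modules. The randomize_sketch function is hypothetical; its rng parameter is an addition for reproducible tests and is not part of the documented signature.

```python
import random
from itertools import islice


def randomize_sketch(items, bs=100, rng=None):
    """Shuffle a stream in independent batches of `bs` items."""
    rng = rng or random
    it = iter(items)
    whole = bs is None or bs == float("inf")
    while True:
        # Either pull everything at once, or just the next batch
        batch = list(it) if whole else list(islice(it, bs))
        if not batch:
            return
        rng.shuffle(batch)
        yield from batch


print(list(randomize_sketch(range(5), 1)))  # [0, 1, 2, 3, 4]
shuffled = list(randomize_sketch(range(10), 3, random.Random(0)))
print(shuffled)  # each run of 3 is a permutation of its own batch
```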
 
- 
- 
class k1lib.cli.modifier.stagger(every: int)[source]¶
- Bases: k1lib.cli.init.BaseCli
Staggers input stream into multiple stream “windows” placed serially. Best explained with an example:
o = range(10) | stagger(3)
o | deref() # returns [0, 1, 2], 1st "window"
o | deref() # returns [3, 4, 5], 2nd "window"
o | deref() # returns [6, 7, 8]
o | deref() # returns [9]
o | deref() # returns []
This might be useful when you’re constructing a data loader:
dataset = [range(20), range(30, 50)] | transpose()
dl = dataset | batched(3) | (transpose() | toTensor()).all() | stagger(4)
for epoch in range(3):
    for xb, yb in dl: # looping over a window
        print(epoch)
        # then something like: model(xb)
The above code will print 6 lines. 4 of them are “0” (because we stagger every 4 batches), and xb’s shape will be (3,) (because we batched every 3 samples).
You should also keep in mind that this doesn’t really change the properties of the stream itself. Essentially, treat these pairs of statements as being the same thing:
o = range(11, 100)
# both return 11
o | stagger(20) | item()
o | item()
# both return [11, 12, ..., 20]
o | head(10) | deref()
o | stagger(20) | head(10) | deref()
Lastly, multiple iterators might be getting values from the same stream window, meaning:
o = range(11, 100) | stagger(10)
it1 = iter(o); it2 = iter(o)
next(it1) # returns 11
next(it2) # returns 12
This may or may not be desirable. This should be obvious, but I want to mention it in case it’s not clear to you.
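The window behaviour, including the shared-state caveat at the end, can be reproduced with a small class. The Stagger class below is a hypothetical sketch built on itertools.islice and is not the library's implementation.

```python
from itertools import islice


class Stagger:
    """Each new iteration yields at most `every` items from one shared iterator."""

    def __init__(self, items, every):
        self._it = iter(items)  # shared by every window
        self._every = every

    def __iter__(self):
        return islice(self._it, self._every)


o = Stagger(range(10), 3)
windows = [list(o) for _ in range(5)]
print(windows)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9], []]

shared = Stagger(range(11, 100), 10)
it1, it2 = iter(shared), iter(shared)
print(next(it1), next(it2))  # 11 12
```

Since both windows draw from the same underlying iterator, the second next() continues where the first left off.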
- 
class k1lib.cli.modifier.op[source]¶
- Bases: k1lib._baseClasses.Absorber, k1lib.cli.init.BaseCli
Absorbs operations done on it and applies them on the stream. Based on Absorber. Example:
t = torch.tensor([[1, 2, 3], [4, 5, 6.0]])
# returns [torch.tensor([[4., 5., 6., 7., 8., 9.]])]
[t] | (op() + 3).view(1, -1).all() | deref()
Basically, you can treat op() as the input tensor. Tbh, you can do the same thing with this:
[t] | applyS(lambda t: (t+3).view(1, -1)).all() | deref()
But that’s kinda long and may not be obvious. This can be surprisingly resilient, as you can still combine with other cli tools as usual, for example:
# returns [2, 3], demonstrating "&" operator
torch.randn(2, 3) | (op().shape & identity()) | deref() | item()
a = torch.tensor([[1, 2, 3], [7, 8, 9]])
# returns torch.tensor([4, 5, 6]), demonstrating "+" operator for clis and not clis
(a | op() + 3 + identity() | item() == torch.tensor([4, 5, 6])).all()
# returns [[3], [3]], demonstrating .all() and "|" serial chaining
torch.randn(2, 3) | (op().shape.all() | deref())
# returns [[8, 18], [9, 19]], demonstrating you can treat `op()` as a regular function
[range(10), range(10, 20)] | transpose() | filt(op() > 7, 0) | deref()
Performance-wise, there is some degradation, but not a lot, so don’t worry about it. Simple operations execute pretty much on par with native lambdas:
n = 10_000_000
# takes 1.48s
for i in range(n): i**2
# takes 1.89s, 1.28x worse than for loop
range(n) | apply(lambda x: x**2) | ignore()
# takes 1.86s, 1.26x worse than for loop
range(n) | apply(op()**2) | ignore()
# takes 1.86s
range(n) | (op()**2).all() | ignore()
More complex operations can take more of a hit:
# takes 1.66s
for i in range(n): i**2-3
# takes 2.02s, 1.22x worse than for loop
range(n) | apply(lambda x: x**2-3) | ignore()
# takes 2.81s, 1.69x worse than for loop
range(n) | apply(op()**2-3) | ignore()
Reserved operations that are not absorbed are:
- all
- __ror__ (__or__ still works!) 
- op_solidify 
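The "absorb now, apply later" idea behind op can be sketched with a class that records each operation and replays the recorded steps when a value is piped in. The Op class below is a hypothetical, much-reduced stand-in: it supports only a handful of operators and does not absorb attribute access or method calls the way the real Absorber does.

```python
class Op:
    """Records operations applied to it, then replays them on a piped value."""

    def __init__(self, steps=()):
        self._steps = tuple(steps)

    def _then(self, step):
        # Every operation returns a new Op with one more recorded step
        return Op(self._steps + (step,))

    def __add__(self, other): return self._then(lambda v: v + other)
    def __sub__(self, other): return self._then(lambda v: v - other)
    def __pow__(self, other): return self._then(lambda v: v ** other)
    def __mod__(self, other): return self._then(lambda v: v % other)
    def __gt__(self, other): return self._then(lambda v: v > other)

    def __ror__(self, value):
        # `value | op_expr` replays the recorded steps in order
        for step in self._steps:
            value = step(value)
        return value


squared_minus_3 = Op() ** 2 - 3
print(5 | squared_minus_3)                  # 22
print([x | (Op() % 2) for x in range(4)])   # [0, 1, 0, 1]
print(7 | (Op() > 3))                       # True
```

The parentheses around comparison expressions matter in this sketch, since Python gives `|` higher precedence than `>`.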
 
output module¶
For operations that feel like the termination
- 
class k1lib.cli.output.stdout(fs=[])[source]¶
- Bases: k1lib.cli.init.BaseCli
Prints out all lines. If not iterable, then prints out the input raw. Example:
# prints out "0\n1\n2"
range(3) | stdout()
# same as above, but (maybe?) more familiar
range(3) > stdout()
- 
class k1lib.cli.output.file(fileName: str, text: bool = True)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(fileName: str, text: bool = True)[source]¶
- Opens a new file for writing. Example:
# writes "0\n1\n2\n" to file
range(3) | file("test/f.txt")
# same as above, but (maybe?) more familiar
range(3) > file("test/f.txt")
# returns ['0', '1', '2']
cat("test/f.txt") | deref()
# writes bytes to file
b'5643' | file("test/a.bin", False)
# returns ['5643']
cat("test/a.bin") | deref()
- Parameters
- text – if True, accepts Iterator[str], and prints out each string on a new line. Else accepts bytes and writes them in one go.
 
 
- 
- 
class k1lib.cli.output.pretty(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Pretty prints a table. Not really used directly. Example:
# These 2 statements are pretty much the same
[range(10), range(10)] | head(5) | pretty() > stdout()
[range(10), range(10)] | display()
- 
class k1lib.cli.output.intercept(raiseError: bool = True)[source]¶
- Bases: - k1lib.cli.init.BaseCli
sam module¶
This is for functions that are .sam or .bam related
- 
class k1lib.cli.sam.header(long=True)[source]¶
- Bases: - k1lib.cli.init.BaseCli
structural module¶
This is for functions that sort of change the table structure in a dramatic way. They’re the core transformations
- 
k1lib.cli.structural.yieldSentinel¶
- Object that can be yielded in a stream to ignore this stream for the moment in joinStreamsRandom. It will also stop deref early.
- 
class k1lib.cli.structural.joinStreamsRandom(fs=[])[source]¶
- Joins multiple streams randomly. If any stream runs out, then quits. If any stream yields yieldSentinel, then just ignores that result and continues. Could be useful in active learning. Example:
# could return [0, 1, 10, 2, 11, 12, 13, ...], with max length 20, typical length 18
[range(0, 10), range(10, 20)] | joinStreamsRandom() | deref()
stream2 = [[-5, yieldSentinel, -4, -3], yieldSentinel | repeat()] | joinStreams()
# could return [-5, -4, 0, -3, 1, 2, 3, 4, 5, 6], demonstrating yieldSentinel
[range(7), stream2] | joinStreamsRandom() | deref()
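The behavior described above can be sketched in plain Python as follows. This is only an illustration under assumptions: `join_streams_random` and `SKIP` are hypothetical names (with `SKIP` standing in for yieldSentinel), and the real tool may pick streams differently.

```python
import random

SKIP = object()  # hypothetical stand-in for yieldSentinel


def join_streams_random(streams, rng=random):
    """Pull from a randomly chosen stream each step; stop when any stream ends."""
    its = [iter(s) for s in streams]
    while True:
        it = rng.choice(its)
        try:
            value = next(it)
        except StopIteration:
            return  # the chosen stream ran out, so quit entirely
        if value is not SKIP:
            yield value  # sentinel values are silently dropped


# seeded generator only so that repeated runs behave the same
out = list(join_streams_random([range(0, 10), range(10, 20)], random.Random(0)))
low = [x for x in out if x < 10]
high = [x for x in out if x >= 10]
```

Each individual stream keeps its own order; only the interleaving is random.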
- 
class k1lib.cli.structural.transpose(fillValue=None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(fillValue=None)[source]¶
- Joins multiple columns and loops through all rows. Aka transpose. Example:
# returns [[1, 4], [2, 5], [3, 6]]
[[1, 2, 3], [4, 5, 6]] | transpose() | deref()
# returns [[1, 4], [2, 5], [3, 6], [0, 7]]
[[1, 2, 3], [4, 5, 6, 7]] | transpose(0) | deref()
Be careful with infinite streams, as transposing a stream of shape (inf, 5) will hang this operation!
- Parameters
- fillValue – if not None, then will try to zip longest with this fill value 
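As a rough plain-Python equivalent of the two modes described above (a sketch, assuming the tool behaves like `zip` and `itertools.zip_longest`; the function name `transpose_sketch` is made up):

```python
from itertools import zip_longest


def transpose_sketch(table, fill_value=None):
    """Transpose rows into columns, optionally padding short rows."""
    if fill_value is None:
        # plain zip stops at the shortest row
        return [list(col) for col in zip(*table)]
    # zip_longest pads shorter rows with fill_value
    return [list(col) for col in zip_longest(*table, fillvalue=fill_value)]


square = transpose_sketch([[1, 2, 3], [4, 5, 6]])
padded = transpose_sketch([[1, 2, 3], [4, 5, 6, 7]], 0)
```

The `*table` unpacking has to read every row up front, which is one way to see why a stream with infinitely many rows would hang.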
 
 
- 
- 
class k1lib.cli.structural.joinList(element=None, begin=True)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(element=None, begin=True)[source]¶
- Joins an element into a list. Example:
# returns [5, 2, 6, 8]
[5, [2, 6, 8]] | joinList() | deref()
# also returns [5, 2, 6, 8]
[2, 6, 8] | joinList(5) | deref()
- Parameters
- element – the element to insert. If None, then takes the input [e, […]], else takes the input […] as usual 
 
 
- 
- 
class k1lib.cli.structural.splitList(*weights: List[float])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*weights: List[float])[source]¶
- Splits list of elements into multiple lists. If no weights are provided, then automatically defaults to [0.8, 0.2]. Example:
# returns [[0, 1, 2, 3, 4, 5, 6, 7], [8, 9]]
range(10) | splitList(0.8, 0.2) | deref()
# same as the above
range(10) | splitList() | deref()
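A weighted split like the one above could be sketched as below. The rounding rule for the cut points is an assumption (the source doesn't say how boundaries are chosen), and `split_list` is a hypothetical name:

```python
def split_list(items, *weights):
    """Split items into consecutive chunks whose sizes follow the weights."""
    weights = weights or (0.8, 0.2)  # default mentioned in the docs above
    items = list(items)
    total = sum(weights)
    out, start, acc = [], 0, 0.0
    for w in weights:
        acc += w
        # cut point = cumulative share of the list, rounded (assumed policy)
        end = round(len(items) * acc / total)
        out.append(items[start:end])
        start = end
    return out


explicit = split_list(range(10), 0.8, 0.2)
default = split_list(range(10))
```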
 
- 
- 
class k1lib.cli.structural.joinStreams(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Joins multiple streams. Example:
# returns [1, 2, 3, 4, 5]
[[1, 2, 3], [4, 5]] | joinStreams() | deref()
- 
class k1lib.cli.structural.activeSamples(limit: int = 100, p: float = 0.95)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(limit: int = 100, p: float = 0.95)[source]¶
- Yields active learning samples. Example:
o = activeSamples()
ds = range(10) # normal dataset
ds = [o, ds] | joinStreamsRandom() # dataset with active learning capability
next(ds) # returns 0
next(ds) # returns 1
next(ds) # returns 2
o.append(20)
next(ds) # can return 3 or 20
next(ds) # can return (4 or 20) or 4
So the point of this is to be a generator of samples. You can define your dataset as a mix of active learning samples and standard samples. Whenever there’s a data point that you want to focus on, you can add it to o and it will eventually yield it.
Warning
It might not be a good idea to set param limit higher than 100. This is because the network might still not understand a wrong sample after being shown it multiple times, and will keep adding that wrong sample back in, distracting it from other samples and reducing the network’s accuracy once active learning is removed.
If limit is low enough (from my testing, 30-100 should be fine), then old wrong samples will be kicked out, allowing a fresh stream of wrong samples to come in, and preventing the problem above. If you find that removing active learning makes the accuracy drop dramatically, then try decreasing the limit.
- Parameters
- limit – max number of active samples. Discards samples if number of samples is over this. 
- p – probability of actually adding the samples in 
 
 
 
- 
- 
k1lib.cli.structural.table(delim: Optional[str] = None)[source]¶
- Basically op().split(delim).all(). This exists because this is used quite a lot in bioinformatics. Example:
# returns [['a', 'bd'], ['1', '2', '3']]
["a|bd", "1|2|3"] | table("|") | deref()
- 
class k1lib.cli.structural.batched(bs=32, includeLast=False)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(bs=32, includeLast=False)[source]¶
- Batches the input stream. Example:
# returns [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
range(11) | batched(3) | deref()
# returns [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]
range(11) | batched(3, True) | deref()
# returns [[0, 1, 2, 3, 4]]
range(5) | batched(float("inf"), True) | deref()
# returns []
range(5) | batched(float("inf"), False) | deref()
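The four cases above, including the infinite batch size, can be reproduced with a small generator. This is a sketch of the described behavior rather than the library's code; `batched_sketch` and `include_last` are illustrative names:

```python
from itertools import islice


def batched_sketch(it, bs=32, include_last=False):
    """Yield lists of bs items; keep the short final batch only if asked."""
    it = iter(it)
    n = None if bs == float("inf") else bs  # islice takes None for "no limit"
    while True:
        batch = list(islice(it, n))
        if len(batch) == bs or (include_last and batch):
            yield batch
        if len(batch) < bs:
            return  # input exhausted


full_only = list(batched_sketch(range(11), 3))
with_last = list(batched_sketch(range(11), 3, True))
one_big = list(batched_sketch(range(5), float("inf"), True))
nothing = list(batched_sketch(range(5), float("inf"), False))
```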
 
- 
- 
k1lib.cli.structural.collate()[source]¶
- Puts individual columns into a tensor. Example:
# returns [tensor([ 0, 10, 20]), tensor([ 1, 11, 21]), tensor([ 2, 12, 22])]
[range(0, 3), range(10, 13), range(20, 23)] | collate() | toList()
- 
k1lib.cli.structural.insertRow(*row: List[T])[source]¶
- Inserts a row right before all other rows. See also: joinList().
- 
k1lib.cli.structural.insertColumn(*column, begin=True, fillValue='')[source]¶
- Inserts a column at the beginning or end. Example:
# returns [['a', 1, 2], ['b', 3, 4]]
[[1, 2], [3, 4]] | insertColumn("a", "b") | deref()
- 
k1lib.cli.structural.insertIdColumn(table=False, begin=True, fillValue='')[source]¶
- Inserts an id column at the beginning (or end). Example:
# returns [[0, 'a', 2], [1, 'b', 4]]
[["a", 2], ["b", 4]] | insertIdColumn(True) | deref()
# returns [[0, 'a'], [1, 'b']]
"ab" | insertIdColumn()
- Parameters
- table – if False, then insert column to an Iterator[str], else treat input as a full fledged table 
 
- 
class k1lib.cli.structural.toDict[source]¶
- Bases: - k1lib.cli.init.BaseCli
- 
class k1lib.cli.structural.toDictF(keyF: Optional[Callable[[Any], str]] = None, valueF: Optional[Callable[[Any], Any]] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(keyF: Optional[Callable[[Any], str]] = None, valueF: Optional[Callable[[Any], Any]] = None)[source]¶
- Transforms an incoming stream into a dict using a function for values. Example:
names = ["wanda", "vision", "loki", "mobius"]
names | toDictF(valueF=lambda s: len(s)) # will return {"wanda": 5, "vision": 6, ...}
names | toDictF(lambda s: s.title(), lambda s: len(s)) # will return {"Wanda": 5, "Vision": 6, ...}
 
- 
- 
class k1lib.cli.structural.expandE(f: Callable[[T], List[T]], column: int)[source]¶
- Bases: - k1lib.cli.init.BaseCli
- 
k1lib.cli.structural.unsqueeze(dim: int = 0)[source]¶
- Unsqueezes the input iterator. Example:
t = [[1, 2], [3, 4], [5, 6]]
# returns (3, 2)
t | shape()
# returns (1, 3, 2)
t | unsqueeze(0) | shape()
# returns (3, 1, 2)
t | unsqueeze(1) | shape()
# returns (3, 2, 1)
t | unsqueeze(2) | shape()
Behind the scenes, it’s really just wrapList().all(dim), but the “unsqueeze” name is a lot more familiar. Also note that the inverse operation “squeeze” is sort of item().all(dim), if you’re sure that this is desirable:
t = [[1, 2], [3, 4], [5, 6]]
# returns (3, 2)
t | unsqueeze(1) | item().all(1) | shape()
- 
class k1lib.cli.structural.count(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Finds unique elements and returns a table with [frequency, value, percent] columns. Example:
# returns [[1, 'a', '33%'], [2, 'b', '67%']]
['a', 'b', 'b'] | count() | deref()
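For readers who think in standard-library terms, the table above is close to what `collections.Counter` gives you. The sketch below assumes percentages are rounded to whole numbers and rows appear in first-seen order, neither of which the source spells out; `count_sketch` is a made-up name:

```python
from collections import Counter


def count_sketch(items):
    """Return [frequency, value, percent] rows for each unique element."""
    freq = Counter(items)  # keeps first-seen order of the values
    total = sum(freq.values())
    return [[n, value, f"{round(100 * n / total)}%"] for value, n in freq.items()]


rows = count_sketch(["a", "b", "b"])
```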
- 
class k1lib.cli.structural.permute(*permutations: List[int])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*permutations: List[int])[source]¶
- Permutes the columns. Acts kinda like torch.Tensor.permute(). Example:
# returns [['b', 'a'], ['d', 'c']]
["ab", "cd"] | permute(1, 0) | deref()
 
- 
- 
class k1lib.cli.structural.accumulate(columnIdx: int = 0, avg=False)[source]¶
- Bases: - k1lib.cli.init.BaseCli
- 
class k1lib.cli.structural.AA_(*idxs: List[int], wraps=False)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(*idxs: List[int], wraps=False)[source]¶
- Returns 2 streams, one that has the selected element, and the other the rest. Example:
# returns [5, [1, 6, 3, 7]]
[1, 5, 6, 3, 7] | AA_(1)
# returns [[5, [1, 6, 3, 7]]]
[1, 5, 6, 3, 7] | AA_(1, wraps=True)
You can also put multiple indexes through:
# returns [[1, [5, 6]], [6, [1, 5]]]
[1, 5, 6] | AA_(0, 2)
If you don’t specify anything, then all indexes will be sliced:
# returns [[1, [5, 6]], [5, [1, 6]], [6, [1, 5]]]
[1, 5, 6] | AA_()
As for why the strange name, think of this operation as “AĀ”. In statistics, say you have a set “A”, then “not A” is commonly written as A with an overline “Ā”. So “AA_” represents “AĀ”, and that it first returns the selection A.
- Parameters
- wraps – if True, then the first example will return [[5, [1, 6, 3, 7]]] instead, so that A has the same signature as Ā 
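The "selected element plus the rest" idea can be written out directly with slicing. This is a sketch that mirrors the examples above; the name `aa_sketch` is hypothetical and the handling of edge cases is an assumption:

```python
def aa_sketch(items, *idxs, wraps=False):
    """Return [selected, rest] pairs for each requested index."""
    items = list(items)
    # a single explicit index returns a bare pair unless wraps is set
    single = len(idxs) == 1 and not wraps
    if not idxs:
        idxs = range(len(items))  # no indexes given: slice every position
    pairs = [[items[i], items[:i] + items[i + 1:]] for i in idxs]
    return pairs[0] if single else pairs


one = aa_sketch([1, 5, 6, 3, 7], 1)
wrapped = aa_sketch([1, 5, 6, 3, 7], 1, wraps=True)
two = aa_sketch([1, 5, 6], 0, 2)
every = aa_sketch([1, 5, 6])
```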
 
 
- 
- 
class k1lib.cli.structural.peek(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Returns (firstRow, iterator). This sort of peeks at the first row, to potentially gain some insights about the internal formats. The returned iterator is not tampered with. Example:
e, it = iter([[1, 2, 3], [1, 2]]) | peek()
print(e) # prints "[1, 2, 3]"
s = 0
for e in it: s += len(e)
print(s) # prints "5", or length of 2 lists
You kinda have to be careful about handling the firstRow, because you might inadvertently alter the iterator:
e, it = iter([iter(range(3)), range(4), range(2)]) | peek()
e = list(e) # e is [0, 1, 2]
list(next(it)) # supposed to be the same as `e`, but is [] instead
This happens because you have already consumed all elements of the first row, and thus there aren’t any left when you try to call next(it).
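A common way to peek without losing the first row is to take it off the iterator and chain it back on. The sketch below shows that technique; whether k1lib does exactly this is an assumption, and `peek_sketch` is a made-up name:

```python
from itertools import chain


def peek_sketch(it):
    """Return (first_row, iterator that still starts at first_row)."""
    it = iter(it)
    first = next(it)  # raises StopIteration on empty input
    return first, chain([first], it)


e, rest = peek_sketch(iter([[1, 2, 3], [1, 2]]))
total = sum(len(row) for row in rest)

# the pitfall: the peeked row and the row inside the iterator are the same object
e2, rest2 = peek_sketch(iter([iter(range(3)), range(4)]))
consumed = list(e2)
leftover = list(next(rest2))
```

Because the same object is handed out twice, exhausting a one-shot first row through `e2` leaves nothing for the iterator to give back.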
- 
class k1lib.cli.structural.peekF(f: Union[k1lib.cli.init.BaseCli, Callable[[T], T]])[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(f: Union[k1lib.cli.init.BaseCli, Callable[[T], T]])[source]¶
- Similar to peek, but will execute f(row) and return the input Iterator, which is not tampered with. Example:
it = lambda: iter([[1, 2, 3], [1, 2]])
# prints "[1, 2, 3]" and returns [[1, 2, 3], [1, 2]]
it() | peekF(lambda x: print(x)) | deref()
# prints "1\n2\n3"
it() | peekF(headOut()) | deref()
 
- 
- 
class k1lib.cli.structural.repeat(limit: Optional[int] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli
Yields a specified amount of the passed in object. If you intend to pass in an iterator, then make a list out of it first, as a second copy of the iterator probably won’t work since you will have used it up the first time. Example:
# returns [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
[1, 2, 3] | repeat(3) | toList()
- Parameters
- limit – if None, then repeats indefinitely
 
- 
k1lib.cli.structural.repeatF(f, limit: Optional[int] = None)[source]¶
- Yields a specified amount generated by a specified function. Example:
# returns [4, 4, 4]
repeatF(lambda: 4, 3) | toList()
# returns 10
repeatF(lambda: 4) | head() | shape(0)
- Parameters
- limit – if None, then repeats indefinitely 
 - See also: - repeatFrom
- 
class k1lib.cli.structural.repeatFrom(limit: Optional[int] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(limit: Optional[int] = None)[source]¶
- Yields from a list. If it runs out of elements, then does it again, for limit times in total. Example:
# returns [1, 2, 3, 1, 2]
[1, 2, 3] | repeatFrom() | head(5) | deref()
# returns [1, 2, 3, 1, 2, 3]
[1, 2, 3] | repeatFrom(2) | deref()
- Parameters
- limit – if None, then repeats indefinitely 
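In standard-library terms, the two modes above correspond roughly to `itertools.cycle` (no limit) and chaining a fixed number of copies (with a limit). A sketch, with `repeat_from` as a hypothetical name:

```python
from itertools import chain, cycle, islice, repeat


def repeat_from(items, limit=None):
    """Yield the elements of items over and over, limit times or forever."""
    items = list(items)  # materialize so the elements can be replayed
    if limit is None:
        return cycle(items)
    return chain.from_iterable(repeat(items, limit))


first_five = list(islice(repeat_from([1, 2, 3]), 5))
twice = list(repeat_from([1, 2, 3], 2))
```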
 
 
- 
trace module¶
- 
class k1lib.cli.trace.trace(f=<k1lib.cli.utils.size object>, maxDepth=inf)[source]¶
- Bases: - k1lib.cli.trace._trace- 
last= None¶
- Last instantiated trace object. Access this to view the previous (possibly nested) trace. 
 - 
__init__(f=<k1lib.cli.utils.size object>, maxDepth=inf)[source]¶
- Traces out how the data stream is transformed through complex cli tools. Example:
# returns [1, 4, 9, 16], normal command
range(1, 5) | apply(lambda x: x**2) | deref()
# traced command, will display how the shapes evolve through cli tools
range(1, 5) | trace() | apply(lambda x: x**2) | deref()
Essentially, this derefs every stream before and after every cli tool, and then displays the clis and streams in a graph for visualization.
There are a lot more instructions and code examples over in the tutorial section. Go check it out!
 
- 
utils module¶
This is for all short utilities that have a boilerplate feeling
- 
class k1lib.cli.utils.size(idx=None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(idx=None)[source]¶
- Returns number of rows and columns in the input. Example:
# returns (3, 2)
[[2, 3], [4, 5, 6], [3]] | size()
# returns 3
[[2, 3], [4, 5, 6], [3]] | size(0)
# returns 2
[[2, 3], [4, 5, 6], [3]] | size(1)
# returns (2, 0)
[[], [2, 3]] | size()
# returns (3,)
[2, 3, 5] | size()
# returns 3
[2, 3, 5] | size(0)
# returns (3, 2, 2)
[[[2, 1], [0, 6, 7]], 3, 5] | size()
# returns (1,) and not (1, 3)
["abc"] | size()
# returns (1, 2, 3)
[torch.randn(2, 3)] | size()
- Parameters
- idx – if idx is None return (rows, columns). If 0 or 1, then rows or columns 
 
 
- 
- 
k1lib.cli.utils.shape¶
- alias of - k1lib.cli.utils.size
- 
class k1lib.cli.utils.item(amt: int = 1)[source]¶
- Bases: - k1lib.cli.init.BaseCli
- 
class k1lib.cli.utils.identity(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Yields whatever the input is. Useful for multiple streams. Example:
# returns range(5)
range(5) | identity()
- 
k1lib.cli.utils.iden¶
- alias of - k1lib.cli.utils.identity
- 
class k1lib.cli.utils.toStr(column: Optional[int] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli
- 
class k1lib.cli.utils.join(delim: Optional[str] = None)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(delim: Optional[str] = None)[source]¶
- Merges all strings into 1, with delim in the middle. Basically str.join(). Example:
# returns '2\na'
[2, "a"] | join("\n")
 
- 
- 
class k1lib.cli.utils.toNumpy(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Converts generator to numpy array. Essentially np.array(list(it))
- 
class k1lib.cli.utils.toTensor(dtype=torch.float32)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(dtype=torch.float32)[source]¶
- Converts generator to torch.Tensor. Essentially torch.tensor(list(it)).
Also checks if the input is a PIL Image. If yes, turns it into a torch.Tensor and returns it.
 - 
__ror__(it: Iterator[float]) → torch.Tensor[source]¶
 
- 
- 
class k1lib.cli.utils.toList(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Converts generator to list. list would do the same, but this is just to maintain the style
- 
class k1lib.cli.utils.wrapList(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Wraps inputs inside a list. There’s a more advanced cli tool built from this, which is unsqueeze().
- 
class k1lib.cli.utils.toSet(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Converts generator to set. set would do the same, but this is just to maintain the style
- 
class k1lib.cli.utils.toIter(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli- Converts object to iterator. iter() would do the same, but this is just to maintain the style 
- 
class k1lib.cli.utils.toRange(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli- Returns iter(range(len(it))), effectively 
- 
class k1lib.cli.utils.equals[source]¶
- Bases: - object- Checks if all incoming columns/streams are identical 
- 
class k1lib.cli.utils.reverse(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Reverses incoming list. Example:
# returns [3, 5, 2]
[2, 5, 3] | reverse() | deref()
- 
class k1lib.cli.utils.ignore(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Just loops through everything, ignoring the output. Example:
# will just return an iterator, and not print anything
[2, 3] | apply(lambda x: print(x))
# will print "2\n3"
[2, 3] | apply(lambda x: print(x)) | ignore()
- 
class k1lib.cli.utils.toSum(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Calculates the sum of a list of numbers. Can pipe in torch.Tensor. Example:
# returns 45
range(10) | toSum()
- 
class k1lib.cli.utils.toAvg(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Calculates the average of a list of numbers. Can pipe in torch.Tensor. Example:
# returns 4.5
range(10) | toAvg()
# returns nan
[] | toAvg()
- 
k1lib.cli.utils.toMean¶
- alias of - k1lib.cli.utils.toAvg
- 
class k1lib.cli.utils.toMax(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Calculates the max of a bunch of numbers. Can pipe in torch.Tensor. Example:
# returns 6
[2, 5, 6, 1, 2] | toMax()
- 
class k1lib.cli.utils.toMin(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Calculates the min of a bunch of numbers. Can pipe in torch.Tensor. Example:
# returns 1
[2, 5, 6, 1, 2] | toMin()
- 
class k1lib.cli.utils.lengths(fs=[])[source]¶
- Bases: - k1lib.cli.init.BaseCli
Returns the lengths of each element. Example:
[range(5), range(10)] | lengths() == [5, 10]
This is a simpler (and faster!) version of shape. It assumes each element can be called with len(x), while shape iterates through every element to get the length, and thus is much slower.
- 
k1lib.cli.utils.headerIdx()[source]¶
- Cuts out the first line, puts an index column next to it, and prints it out. Useful when you want to know what your column’s index is to cut it out. Also sets the context variable “header”, in case you need it later. Example:
# returns [[0, 'a'], [1, 'b'], [2, 'c']]
["abc"] | headerIdx() | deref()
- 
class k1lib.cli.utils.deref(ignoreTensors=True, maxDepth=inf)[source]¶
- Bases: - k1lib.cli.init.BaseCli- 
__init__(ignoreTensors=True, maxDepth=inf)[source]¶
- Recursively converts any iterator into a list. Only str, numbers.Number and Module are not converted. Example:
# returns something like "<range_iterator at 0x7fa8c52ca870>"
iter(range(5))
# returns [0, 1, 2, 3, 4]
iter(range(5)) | deref()
# returns [2, 3], yieldSentinel stops things early
[2, 3, yieldSentinel, 6] | deref()
You can also specify a maxDepth:
# returns something like "<list_iterator at 0x7f810cf0fdc0>"
iter([range(3)]) | deref(maxDepth=0)
# returns [range(3)]
iter([range(3)]) | deref(maxDepth=1)
# returns [[0, 1, 2]]
iter([range(3)]) | deref(maxDepth=2)
- Parameters
- ignoreTensors – if True, then don’t loop over torch.Tensor internals
- maxDepth – maximum depth to dereference. Starts at 0 for not doing anything at all 
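The depth-limited recursion described above can be sketched in a few lines. This is an illustration only: it ignores yieldSentinel, tensors and Module, and `deref_sketch` with its `max_depth` argument is a hypothetical stand-in for the real tool:

```python
from numbers import Number


def deref_sketch(x, max_depth=float("inf"), _depth=0):
    """Recursively turn iterables into lists, down to max_depth levels."""
    # strings, bytes and numbers are leaves; so is anything past max_depth
    if isinstance(x, (str, bytes, Number)) or _depth >= max_depth:
        return x
    try:
        it = iter(x)
    except TypeError:
        return x  # not iterable, leave untouched
    return [deref_sketch(e, max_depth, _depth + 1) for e in it]


flat = deref_sketch(iter(range(5)))
depth1 = deref_sketch(iter([range(3)]), max_depth=1)
depth2 = deref_sketch(iter([range(3)]), max_depth=2)
```

With `max_depth=0` the input is handed back unchanged, which matches the "not doing anything at all" wording above.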
 
 - Warning - Works well with PyTorch Tensors, but not NumPy arrays, as they screw things up with the __ror__ operator, so do torch.from_numpy(…) first. Don’t worry about unnecessary copying, as numpy and torch both utilize the buffer protocol.
 - 
__invert__() → k1lib.cli.init.BaseCli[source]¶
- Returns a BaseCli that makes everything an iterator. Not entirely sure when this comes in handy, but it’s there.
 
- 
others module¶
This is for pretty random clis that are scattered everywhere.
- 
k1lib.cli.others.crissCross()[source]¶
- Like the monkey-patched function torch.crissCross(). Example:
# returns another Tensor
[torch.randn(3, 3), torch.randn(3)] | crissCross()
There are a couple monkey-patched clis:
- 
torch.stack()¶
- Stacks tensors together 
Elsewhere in the library¶
There might still be more cli tools scattered around the library. These are pretty rare, quite dynamic and most likely a cool extra feature, not a core functionality, so not worth it/can’t mention it here. Anyway, execute this:
cli.scatteredClis()
to get a list of them.