k1lib.cli module¶
The main idea of this package is to emulate the terminal, but entirely inside Python. So this bash statement:
cat file.txt | head -5 > headerFile.txt
Turns into this statement:
cat("file.txt") | head(5) > file("headerFile.txt")
Here, “cat”, “head” and “file” are all classes extended from BaseCli. All of them implement the “reverse or” operation, or __ror__. Essentially, these 2 statements are equivalent:
3 | obj
obj.__ror__(3)
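To make the “reverse or” idea concrete, here is a minimal sketch in plain Python. The class name HeadSketch is hypothetical and this is not k1lib's actual implementation; it only shows how Python falls back to __ror__ when the left operand does not handle "|".

```python
class HeadSketch:
    """Hypothetical stand-in for a cli tool (not k1lib's real `head`)."""

    def __init__(self, n):
        self.n = n

    def __ror__(self, it):
        # Python calls this for `it | HeadSketch(n)` when `it` itself
        # does not know how to handle the `|` operator.
        return list(it)[:self.n]


# `range` does not implement `|`, so Python falls back to HeadSketch.__ror__
first_three = range(10) | HeadSketch(3)
same_thing = HeadSketch(3).__ror__(range(10))
print(first_three, same_thing)
```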
Also, a lot of these tools assume that we are operating on a table. So this table:
col1 | col2 | col3 |
---|---|---|
1 | 2 | 3 |
4 | 5 | 6 |
Is equivalent to this list:
[["col1", "col2", "col3"], [1, 2, 3], [4, 5, 6]]
Also, the expected way to use these tools is to import everything directly into the current environment, like this:
from k1lib.imports import *
Besides operating on string iterators alone, this package can also be extra meta, and operate on streams of strings, or streams of streams of anything. I think this is one of the most powerful concepts of the cli workflow. If this interests you, check over this:
Core clis include apply, applyS (its multiprocessing cousins applyMp and applyMpBatched are great too), op and deref, so start reading there first. Then, skim over everything to know what you can do with this collection of tools.
bio module¶
This is for functions that are actually biology-related
-
class
k1lib.cli.bio.
transcribe
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Transcribes (DNA -> RNA) incoming rows
-
class
k1lib.cli.bio.
complement
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
class
k1lib.cli.bio.
translate
(length: int = 0)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
class
k1lib.cli.bio.
medAa
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Converts short aa sequence to medium one
entrez module¶
This module is not really fleshed out, not that useful/elegant, and I just use cmd instead
mgi module¶
All tools related to the MGI database. Expected to be used behind the “mgi” module name, like this:
from k1lib.imports import *
["SOD1", "AMPK"] | mgi.batch()
filt module¶
This is for functions that cut out specific parts of the table
-
class
k1lib.cli.filt.
filt
(predicate: Callable[[T], bool], column: Optional[int] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(predicate: Callable[[T], bool], column: Optional[int] = None)[source]¶ Filters out lines. Examples:
# returns [2, 6]
[2, 3, 5, 6] | filt(lambda x: x%2 == 0) | deref()
# returns [3, 5]
[2, 3, 5, 6] | ~filt(lambda x: x%2 == 0) | deref()
# returns [[2, 'a'], [6, 'c']]
[[2, "a"], [3, "b"], [5, "a"], [6, "c"]] | filt(lambda x: x%2 == 0, 0) | deref()
- Parameters
column –
if integer, then predicate(row[column])
if None, then predicate(row)
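The column rule above can be sketched in plain Python. The function filt_sketch and its invert flag are hypothetical names used only for illustration (invert plays the role of "~"); this is an assumption about the behaviour described, not the library's code.

```python
def filt_sketch(rows, predicate, column=None, invert=False):
    """Hypothetical sketch of the predicate/column rule described above."""
    for row in rows:
        # column is None -> test the row itself, else test row[column]
        value = row if column is None else row[column]
        if bool(predicate(value)) != invert:
            yield row


def is_even(x):
    return x % 2 == 0


evens = list(filt_sketch([2, 3, 5, 6], is_even))
odds = list(filt_sketch([2, 3, 5, 6], is_even, invert=True))
table = [[2, "a"], [3, "b"], [5, "a"], [6, "c"]]
even_rows = list(filt_sketch(table, is_even, column=0))
print(evens, odds, even_rows)
```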
-
-
k1lib.cli.filt.
isValue
(value, column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Filters out lines that are different from the given value. Example:
# returns [2, 2]
[1, 2, 3, 2, 1] | isValue(2) | deref()
# returns [1, 3, 1]
[1, 2, 3, 2, 1] | ~isValue(2) | deref()
# returns [[1, 2]]
[[1, 2], [2, 1], [3, 4]] | isValue(2, 1) | deref()
-
k1lib.cli.filt.
isFile
() → k1lib.cli.filt.filt[source]¶ Filters out non-files. Example:
# returns ["a.py", "b.py"], if those files really do exist
["a.py", "hg/", "b.py"] | isFile()
-
k1lib.cli.filt.
inSet
(values: Set[Any], column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Filters out lines that are not in the specified set. Example:
# returns [2, 3]
range(5) | inSet([2, 8, 3]) | deref()
# returns [0, 1, 4]
range(5) | ~inSet([2, 8, 3]) | deref()
-
k1lib.cli.filt.
contains
(s: str, column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Filters out lines that don’t contain the specified substring. Sort of similar to grep, but this is simpler, and can be inverted. Example:
# returns ['abcd', '2bcr']
["abcd", "0123", "2bcr"] | contains("bc") | deref()
-
class
k1lib.cli.filt.
empty
(reverse=False)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(reverse=False)[source]¶ Filters out streams that are not empty. Almost always used inverted, but “empty” is a short, sweet name that is easy to remember. Example:
# returns [[1, 2], ['a']]
[[], [1, 2], [], ["a"]] | ~empty() | deref()
- Parameters
reverse – not intended to be used by the end user. Do ~empty() instead.
-
-
k1lib.cli.filt.
startswith
(s: str, column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Filters out lines that don’t start with s. Example:
# returns ['ab', 'ac']
["ab", "cd", "ac"] | startswith("a") | deref()
# returns ['cd']
["ab", "cd", "ac"] | ~startswith("a") | deref()
-
k1lib.cli.filt.
endswith
(s: str, column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Filters out lines that don’t end with s. See also:
startswith()
-
k1lib.cli.filt.
isNumeric
(column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Filters out a line if that column is not a number. Example:
# returns [0, 2, '3']
[0, 2, "3", "a"] | isNumeric() | deref()
-
k1lib.cli.filt.
instanceOf
(cls: Union[type, Tuple[type]], column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Filters out lines that are not instances of the given type. Example:
# returns [2]
[2, 2.3, "a"] | instanceOf(int) | deref()
# returns [2, 2.3]
[2, 2.3, "a"] | instanceOf((int, float)) | deref()
-
k1lib.cli.filt.
inRange
(min: float = - inf, max: float = inf, column: Optional[int] = None) → k1lib.cli.filt.filt[source]¶ Checks whether a column is in range or not. Example:
# returns [-2, 3, 6]
[-2, -8, 3, 6] | inRange(min=-3) | deref()
# returns [-8]
[-2, -8, 3, 6] | ~inRange(min=-3) | deref()
-
class
k1lib.cli.filt.
head
(n: int = 10)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(n: int = 10)[source]¶ Only outputs the first n lines. You can also negate it (like ~head(5)), which then only outputs the lines after the first n. Examples:
"abcde" | head(2) | deref() # returns ["a", "b"]
"abcde" | ~head(2) | deref() # returns ["c", "d", "e"]
"0123456" | head(-3) | deref() # returns ['0', '1', '2', '3']
"0123456" | ~head(-3) | deref() # returns ['4', '5', '6']
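The four examples above line up with ordinary Python slicing. The sketch below uses a hypothetical helper, head_sketch, and materializes the input into a list for simplicity; the real cli may well be lazy, so treat this as an illustration of the semantics only.

```python
def head_sketch(seq, n=10, invert=False):
    """Hypothetical sketch: head(n) ~ items[:n], ~head(n) ~ items[n:]."""
    items = list(seq)  # assumption: fine for small inputs, not lazy
    return items[n:] if invert else items[:n]


first_two = head_sketch("abcde", 2)
after_two = head_sketch("abcde", 2, invert=True)
all_but_last_three = head_sketch("0123456", -3)
last_three = head_sketch("0123456", -3, invert=True)
print(first_two, after_two, all_but_last_three, last_three)
```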
-
-
class
k1lib.cli.filt.
columns
(*columns: List[int])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*columns: List[int])[source]¶ Cuts out specific columns, sliceable. Examples:
["0123456789"] | cut(5, 8) | deref() # returns [['5', '8']]
["0123456789"] | cut(2) | deref() # returns ['2']
["0123456789"] | ~cut()[:7:2] | deref() # returns [['1', '3', '5', '7', '8', '9']]
If you’re selecting only 1 column, then Iterator[T] will be returned, not Table[T].
-
-
k1lib.cli.filt.
cut
¶ alias of
k1lib.cli.filt.columns
-
class
k1lib.cli.filt.
rows
(*rows: List[int])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*rows: List[int])[source]¶ Cuts out specific rows. Space complexity O(1) as a list is not constructed (unless you’re using some really weird slices).
- Parameters
rows – ints for the row indices
Example:
"0123456789" | rows(2) | deref() # returns ["2"]
"0123456789" | rows(5, 8) | deref() # returns ["5", "8"]
"0123456789" | rows()[2:5] | deref() # returns ["2", "3", "4"]
"0123456789" | ~rows()[2:5] | deref() # returns ["0", "1", "5", "6", "7", "8", "9"]
"0123456789" | ~rows()[:7:2] | deref() # returns ['1', '3', '5', '7', '8', '9']
"0123456789" | rows()[:-4] | deref() # returns ['0', '1', '2', '3', '4', '5']
"0123456789" | ~rows()[:-4] | deref() # returns ['6', '7', '8', '9']
-
-
class
k1lib.cli.filt.
intersection
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Returns the intersection of multiple streams. Example:
# returns set([2, 4, 5])
[[1, 2, 3, 4, 5], [7, 2, 4, 6, 5]] | intersection()
-
class
k1lib.cli.filt.
union
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Returns the union of multiple streams. Example:
# returns {0, 1, 2, 10, 11, 12, 13, 14}
[range(3), range(10, 15)] | union()
-
class
k1lib.cli.filt.
unique
(column: int)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(column: int)[source]¶ Filters out non-unique row elements. Example:
# returns [[1, "a"], [2, "a"]]
[[1, "a"], [2, "a"], [1, "b"]] | unique(0) | deref()
- Parameters
column – doesn’t have the default case of None, because you can always use
k1lib.cli.utils.toSet
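As a rough illustration of what “unique on a column” means in the example above, here is a plain-Python sketch. unique_sketch is a hypothetical name, and keeping the first row seen for each key is an assumption inferred from the documented output.

```python
def unique_sketch(rows, column):
    """Hypothetical sketch: keep the first row seen for each value in `column`."""
    seen = set()
    for row in rows:
        key = row[column]
        if key not in seen:
            seen.add(key)
            yield row


unique_rows = list(unique_sketch([[1, "a"], [2, "a"], [1, "b"]], 0))
print(unique_rows)
```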
-
-
class
k1lib.cli.filt.
breakIf
(f)[source]¶ Bases:
k1lib.cli.init.BaseCli
gb module¶
All tools related to the GenBank file format. Expected to be used behind the “gb” module name, like this:
from k1lib.imports import *
cat("abc.gb") | gb.feats()
-
class
k1lib.cli.gb.
feats
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Fetches features, each on a separate stream
-
static
filt
(*terms: str) → k1lib.cli.init.BaseCli[source]¶ Filters for specific terms in all the feature texts. If there are multiple terms, then filters for the first term, then the second, then the third, so the order of the terms might matter to you
-
static
tag
(tag: str) → k1lib.cli.init.BaseCli[source]¶ Gets a single tag out. Applies this on a single feature only
-
static
-
class
k1lib.cli.gb.
origin
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Return the origin section of the genbank file
grep module¶
-
class
k1lib.cli.grep.
grep
(pattern: str, before: int = 0, after: int = 0)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(pattern: str, before: int = 0, after: int = 0)[source]¶ Finds lines that have the specified pattern. Example:
# returns ['c', 'd', '2', 'd']
"abcde12d34" | grep("d", 1) | deref()
# returns ['d', 'e', 'd', '3', '4']
"abcde12d34" | grep("d", 0, 3).till("e") | deref()
- Parameters
pattern – regex pattern to search for in a line
before – lines before the hit. Outputs independent lines
after – lines after the hit. Outputs independent lines
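The before/after parameters can be pictured with a small plain-Python sketch. grep_sketch is a hypothetical helper, it leaves out the .till() feature, and how overlapping hits are handled here is an assumption rather than the library's documented behaviour.

```python
import re
from collections import deque


def grep_sketch(lines, pattern, before=0, after=0):
    """Hypothetical sketch of grep with context lines (no .till() support)."""
    regex = re.compile(pattern)
    history = deque(maxlen=before)  # remembers up to `before` previous lines
    remaining = 0                   # lines still to emit after the last hit
    for line in lines:
        if regex.search(line):
            yield from history      # context before the hit
            history.clear()
            yield line
            remaining = after
        elif remaining > 0:
            yield line              # context after the hit
            remaining -= 1
        else:
            history.append(line)


with_before = list(grep_sketch("abcde12d34", "d", before=1))
with_after = list(grep_sketch("abcde12d34", "d", after=1))
print(with_before, with_after)
```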
-
-
class
k1lib.cli.grep.
grepToTable
(pattern: str, before: int = 0, after: int = 0)[source]¶ Bases:
k1lib.cli.init.BaseCli
init module¶
-
cli.
cliSettings
= {'defaultDelim': '\t', 'defaultIndent': ' ', 'lookupImgs': True, 'oboFile': None, 'strict': False}¶ Main settings of k1lib.cli. When using:
from k1lib.cli import *
…you can just set the settings like this:
cliSettings["defaultIndent"] = "\t"
There are a few settings:
defaultDelim: default delimiter used in-between columns when creating tables
defaultIndent: default indent used for displaying nested structures
lookupImgs: whether to automatically look up images when exploring something
oboFile: gene ontology obo file location
strict: whether strict mode is on. Turning it on can help you debug stuff, but could also be a pain to work with
-
class
k1lib.cli.init.
BaseCli
(fs=[])[source]¶ Bases:
object
A base class for all the cli stuff. You can definitely create new cli tools that have the same feel without extending from this class, but advanced stream operations (like +, &, .all(), |) won’t work.
At the moment, you don’t have to call super().__init__() and super().__ror__(), as __init__’s only job right now is to solidify any op passed to it, and __ror__ does nothing.
-
__init__
(fs=[])[source]¶ Not expected to be instantiated by the end user.
- Parameters
fs – if any function in here is actually an op, then solidifies it (makes it not absorb __call__ anymore)
-
__and__
(cli: k1lib.cli.init.BaseCli) → k1lib.cli.init.oneToMany[source]¶ Duplicates input stream to multiple joined clis.
-
__add__
(cli: k1lib.cli.init.BaseCli) → k1lib.cli.init.manyToManySpecific[source]¶ Parallel pass multiple streams to multiple clis.
-
all
() → k1lib.cli.init.BaseCli[source]¶ Applies this cli to all incoming streams
-
__or__
(it) → k1lib.cli.init.serial[source]¶ Joins clis end-to-end
-
-
class
k1lib.cli.init.
serial
(*clis: List[k1lib.cli.init.BaseCli])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*clis: List[k1lib.cli.init.BaseCli])[source]¶ Merges clis into 1, feeding end to end. Used in chaining clis together without a prime iterator. Meaning, without this, stuff like this fails to run:
[1, 2] | a() | b() # runs
c = a() | b(); [1, 2] | c # doesn't run if this class doesn't exist
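Why a class like this is needed can be seen in a small plain-Python sketch. CliSketch is a hypothetical stand-in, not the real BaseCli or serial; it shows that "a | b" with no data on the left has to return a new object that remembers both steps.

```python
class CliSketch:
    """Hypothetical stand-in for a cli tool wrapping a function."""

    def __init__(self, f):
        self.f = f

    def __ror__(self, it):
        # `data | cli`: run this step on the data
        return self.f(it)

    def __or__(self, other):
        # `cli | cli`: no data yet, so build a new cli that runs both in order
        return CliSketch(lambda it: other.__ror__(self.__ror__(it)))


double = CliSketch(lambda it: [x * 2 for x in it])
inc = CliSketch(lambda it: [x + 1 for x in it])

direct = [1, 2] | double | inc   # data flows through each step immediately
chained = double | inc           # merged first, fed later
deferred = [1, 2] | chained
print(direct, deferred)
```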
-
-
class
k1lib.cli.init.
oneToMany
(*clis: List[k1lib.cli.init.BaseCli])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*clis: List[k1lib.cli.init.BaseCli])[source]¶ Duplicates 1 stream into multiple streams, each for a cli in the list. Used in the “a & b” joining operator
-
-
class
k1lib.cli.init.
manyToMany
(cli: k1lib.cli.init.BaseCli)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(cli: k1lib.cli.init.BaseCli)[source]¶ Applies multiple streams to a single cli. Used in the “a.all()” operator. Note that this operation will use a different copy of the cli for each of the streams.
-
-
class
k1lib.cli.init.
manyToManySpecific
(*clis: List[k1lib.cli.init.BaseCli])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*clis: List[k1lib.cli.init.BaseCli])[source]¶ Applies multiple streams to multiple clis independently. Used in the “a + b” joining operator
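The three joining classes above differ only in how streams are paired with clis. The plain functions below are hypothetical sketches of that pairing (ordinary callables stand in for clis); they are not the library's implementation.

```python
def one_to_many_sketch(stream, *fs):
    """`a & b`: every function gets its own copy of the same stream."""
    items = list(stream)
    return [f(list(items)) for f in fs]


def many_to_many_sketch(streams, f):
    """`a.all()`: the same function is applied to each stream."""
    return [f(s) for s in streams]


def many_to_many_specific_sketch(streams, *fs):
    """`a + b`: the i-th function is applied to the i-th stream."""
    return [f(s) for f, s in zip(fs, streams)]


duplicated = one_to_many_sketch(range(3), sum, max)
applied_to_all = many_to_many_sketch([[1, 2], [3, 4]], sum)
paired = many_to_many_specific_sketch([[1, 2], [3, 4]], sum, max)
print(duplicated, applied_to_all, paired)
```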
-
inp module¶
This module for tools that will likely start the processing stream.
-
k1lib.cli.inp.
cat
(fileName: Optional[str] = None, text: bool = True)[source]¶ Reads a file line by line. Example:
# display first 10 lines of file
cat("file.txt") | headOut()
# piping in also works
"file.txt" | cat() | headOut()
# rename file
cat("img.png", False) | file("img2.png", False)
- Parameters
fileName – if None, then return a BaseCli that accepts a file name and outputs Iterator[str]
text – if True, read text file, else read binary file
-
k1lib.cli.inp.
curl
(url: str) → Iterator[str][source]¶ Gets file from url. File can’t be a binary blob. Example:
# prints out first 10 lines of the website
curl("https://k1lib.github.io/") | headOut()
-
k1lib.cli.inp.
wget
(url: str, fileName: Optional[str] = None)[source]¶ Downloads a file
- Parameters
url – The url of the file
fileName – if None, then tries to infer it from the url
-
k1lib.cli.inp.
ls
(folder: Optional[str] = None)[source]¶ List every file and folder inside the specified folder. Example:
# returns List[str]
ls("/home")
# same as above
"/home" | ls()
# only outputs files, not folders
ls("/home") | isFile()
See also:
isFile()
-
class
k1lib.cli.inp.
cmd
(cmd: str)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(cmd: str)[source]¶ Runs a command, and returns the output line by line. Example:
# return detailed list of files
None | cmd("ls -la")
# return list of files that ends with "ipynb"
None | cmd("ls -la") | cmd('grep ipynb$')
-
property
err
¶ Error from the last command
-
-
k1lib.cli.inp.
requireCli
(cliTool: str)[source]¶ Searches for a particular cli tool (e.g. “ls”); throws ImportError if not found, otherwise does nothing
-
class
k1lib.cli.inp.
toPIL
[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
()[source]¶ Converts a path to a PIL image. Example:
ls(".") | toPIL().all() | item() # get first image
-
__ror__
(path) → PIL.Image.Image[source]¶
-
kcsv module¶
All tools related to the csv file format. Expected to be used behind the “kcsv” module name, like this:
from k1lib.imports import *
kcsv.cat("file.csv") | display()
kxml module¶
All tools related to the xml file format. Expected to be used behind the “kxml” module name, like this:
from k1lib.imports import *
cat("abc.xml") | kxml.node() | kxml.display()
-
class
k1lib.cli.kxml.
node
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Turns lines into a single node
-
__ror__
(it: Iterator[str]) → Iterator[xml.etree.ElementTree.Element][source]¶
-
-
class
k1lib.cli.kxml.
maxDepth
(depth: Optional[int] = None, copy: bool = True)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(depth: Optional[int] = None, copy: bool = True)[source]¶ Filters out too deep nodes
- Parameters
depth – max depth to include
copy – whether to limit the nodes themselves, or limit a copy
-
__ror__
(nodes: Iterator[xml.etree.ElementTree.Element]) → Iterator[xml.etree.ElementTree.Element][source]¶
-
-
class
k1lib.cli.kxml.
tag
(tag: str)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(tag: str)[source]¶ Finds all tags that have a particular name. If found, then don’t search deeper
-
__ror__
(nodes: Iterator[xml.etree.ElementTree.Element]) → Iterator[xml.etree.ElementTree.Element][source]¶
-
-
class
k1lib.cli.kxml.
pretty
(indent: Optional[str] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__ror__
(it: Iterator[xml.etree.ElementTree.Element]) → Iterator[str][source]¶
-
modifier module¶
This is for quick modifiers, think of them as changing formats
-
class
k1lib.cli.modifier.
apply
(f: Callable[[str], str], column: Optional[int] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(f: Callable[[str], str], column: Optional[int] = None)[source]¶ Applies a function f to every line. Example:
# returns [0, 1, 4, 9, 16]
range(5) | apply(lambda x: x**2) | deref()
# returns [[3.0, 1.0, 1.0], [3.0, 1.0, 1.0]]
torch.ones(2, 3) | apply(lambda x: x+2, 0) | deref()
You can also use this as a decorator, like this:
@apply
def f(x):
    return x**2
# returns [0, 1, 4, 9, 16]
range(5) | f | deref()
- Parameters
column – if not None, then applies the function to that column only
-
-
class
k1lib.cli.modifier.
applyMp
(f: Callable[[T], T], prefetch: Optional[int] = None, timeout: float = 2, **kwargs)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(f: Callable[[T], T], prefetch: Optional[int] = None, timeout: float = 2, **kwargs)[source]¶ Like apply, but executes f(row) for each row in multiple processes. Example:
# returns [3, 2]
["abc", "de"] | applyMp(lambda s: len(s)) | deref()
# returns [5, 6, 9]
range(3) | applyMp(lambda x, bias: x**2+bias, bias=5) | deref()
# returns [[1, 2, 3], [1, 2, 3]], demonstrating outside vars work
someList = [1, 2, 3]
["abc", "de"] | applyMp(lambda s: someList) | deref()
Internally, this will continuously spawn new jobs up until 80% of all CPU cores are utilized. On posix systems, the default multiprocessing start method is fork(). This sort of means that all the variables in memory will be copied over. This might be expensive (might also not, with copy-on-write), so you might have to think about that. On windows and macos, the default start method is spawn, meaning each child process is a completely new interpreter, so you have to pass in all required variables and reimport every dependency. Read more at https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
If you don’t wish to schedule all jobs at once, you can specify a prefetch amount, and it will only schedule that many jobs ahead of time. Example:
range(10000) | applyMp(lambda x: x**2) | head() | deref() # 700ms
range(10000) | applyMp(lambda x: x**2, 5) | head() | deref() # 300ms
# demonstrating there're no huge penalties even if we want all results at the same time
range(10000) | applyMp(lambda x: x**2) | deref() # 900ms
range(10000) | applyMp(lambda x: x**2, 5) | deref() # 1000ms
The first line will schedule all jobs at once, and thus will require more RAM and compute power, even though we discard most of the results anyway (the head cli). The second line only schedules 5 jobs ahead of time, and thus will be far more efficient if you don’t need all results right away.
Note
Remember that every BaseCli is also a function, meaning that you can do stuff like:
# returns [['ab', 'ac']]
[["ab", "cd", "ac"]] | applyMp(startswith("a") | deref()) | deref()
Also remember that the return result of f should not be a generator. That’s why in the example above, there’s a deref() inside f.
Most of the time, you’d probably want to use applyMpBatched instead. That cli tool has the same look and feel as this, but executes f multiple times in a single job, instead of executing f only 1 time per job here, so it should dramatically improve performance for most workloads.
One last thing. Remember to close all pools (using clearPools()) before exiting the script so that all child processes are terminated and resources are freed. For example, if you use CUDA tensors but have not closed all pools yet, then it is possible that CUDA memory is not freed. I learned this the hard way. I’ve tried to use atexit to close pools automatically, but it doesn’t seem to work with notebooks.
- Parameters
prefetch – if not specified, schedules all jobs at the same time. If specified, schedules jobs so that there’ll only be a specified amount of jobs, and will only schedule more if results are actually being used.
timeout – seconds to wait for job before raising an error
kwargs – extra arguments to be passed to the function. args not included as there’re a couple of options you can pass for this cli.
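The prefetch idea (only keep a bounded number of jobs in flight, and schedule more as results are consumed) can be sketched with the standard library. This sketch swaps in a thread pool for the process pool to stay short and portable; apply_prefetch_sketch and max_workers are hypothetical names and none of this is the library's actual scheduling code.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def apply_prefetch_sketch(f, items, prefetch, timeout=2.0, max_workers=4):
    """Hypothetical sketch: yield f(x) in order, at most `prefetch` jobs ahead."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = deque()
        for x in items:
            pending.append(pool.submit(f, x))
            if len(pending) >= prefetch:
                # Only hand out a result (and later schedule more work)
                # when the consumer actually asks for the next value.
                yield pending.popleft().result(timeout=timeout)
        while pending:
            yield pending.popleft().result(timeout=timeout)


squares = list(apply_prefetch_sketch(lambda x: x**2, range(10), prefetch=3))
print(squares)
```

If the consumer stops early (as with head in the timing example above), the generator is closed and only the few jobs already in flight have been scheduled.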
-
-
k1lib.cli.modifier.
applyMpBatched
(f, bs=32, prefetch=2, timeout=5)[source]¶ Pretty much the same as applyMp and has the same feel to it too. Iterator[A] goes in, Iterator[B] goes out, and you specify f(A) -> B. However, this will launch jobs that each execute f() multiple times, instead of 1 job per execution. All examples from applyMp should work perfectly here.
-
class
k1lib.cli.modifier.
applyS
(f: Callable[[T], T])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(f: Callable[[T], T])[source]¶ Like apply, but much simpler, just operating on the entire input object, essentially. The “S” stands for “single”. Example:
# returns 5
3 | applyS(lambda x: x+2)
Like apply, you can also use this as a decorator like this:
@applyS
def f(x):
    return x+2
# returns 5
3 | f
This also decorates the returned object so that it has the same qualname, docstring and whatnot.
-
-
k1lib.cli.modifier.
replace
(s: str, target: Optional[str] = None, column: Optional[int] = None)[source]¶ Replaces substring s with target for each line. Example:
# returns ['104', 'ab0c']
["1234", "ab23c"] | replace("23", "0") | deref()
- Parameters
target – if not specified, then use the default delimiter specified in cliSettings
-
k1lib.cli.modifier.
remove
(s: str, column: Optional[int] = None)[source]¶ Removes a specific substring in each line.
-
k1lib.cli.modifier.
toFloat
(*columns: List[int], force=False)[source]¶ Converts every row into a float. Example:
# returns [1, 3, -2.3]
["1", "3", "-2.3"] | toFloat() | deref()
# returns [[1.0, 'a'], [2.3, 'b'], [8.0, 'c']]
[["1", "a"], ["2.3", "b"], [8, "c"]] | toFloat(0) | deref()
With weird rows:
# returns [[1.0, 'a'], [8.0, 'c']]
[["1", "a"], ["c", "b"], [8, "c"]] | toFloat(0) | deref()
# returns [[1.0, 'a'], [0.0, 'b'], [8.0, 'c']]
[["1", "a"], ["c", "b"], [8, "c"]] | toFloat(0, force=True) | deref()
- Parameters
columns – if nothing, then will convert each row. If available, then convert all the specified columns
force – if True, forces weird values to 0.0, else filters out all weird rows
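The force rule for “weird” rows can be sketched in plain Python. to_float_sketch is a hypothetical helper handling a single column; it mirrors the documented examples but is only an assumption about how the real cli treats conversion errors.

```python
def to_float_sketch(rows, column, force=False):
    """Hypothetical sketch: convert row[column] to float, drop or zero bad rows."""
    for row in rows:
        row = list(row)  # copy so the caller's rows are left untouched
        try:
            row[column] = float(row[column])
        except (TypeError, ValueError):
            if not force:
                continue        # force=False: filter the weird row out
            row[column] = 0.0   # force=True: replace the weird value with 0.0
        yield row


table = [["1", "a"], ["c", "b"], [8, "c"]]
filtered = list(to_float_sketch(table, 0))
forced = list(to_float_sketch(table, 0, force=True))
print(filtered, forced)
```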
-
k1lib.cli.modifier.
toInt
(*columns: List[int], force=False)[source]¶ Converts every row into an integer. Example:
# returns [1, 3, -2]
["1", "3", "-2.3"] | toInt() | deref()
- Parameters
columns – if nothing, then will convert each row. If available, then convert all the specified columns
force – if True, forces weird values to 0, else filters out all weird rows
See also:
toFloat()
-
class
k1lib.cli.modifier.
sort
(column: int = 0, numeric=True, reverse=False)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(column: int = 0, numeric=True, reverse=False)[source]¶ Sorts all lines based on a specific column. Example:
# returns [[5, 'a'], [1, 'b']]
[[1, "b"], [5, "a"]] | ~sort(0) | deref()
# returns [[2, 3]]
[[1, "b"], [5, "a"], [2, 3]] | ~sort(1) | deref()
# errors out, as you can't really compare str with int
[[1, "b"], [2, 3], [5, "a"]] | sort(1, False) | deref()
- Parameters
column – if None, sort rows based on themselves and not an element
numeric – whether to convert column to float
reverse – False for smaller to bigger, True for bigger to smaller. Use __invert__() to quickly reverse the order instead of using this param
-
-
class
k1lib.cli.modifier.
sortF
(f: Callable[[T], float], reverse=False)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(f: Callable[[T], float], reverse=False)[source]¶ Sorts rows using a function. Example:
# returns ['a', 'aa', 'aaa', 'aaaa', 'aaaaa']
["a", "aaa", "aaaaa", "aa", "aaaa"] | sortF(lambda r: len(r)) | deref()
# returns ['aaaaa', 'aaaa', 'aaa', 'aa', 'a']
["a", "aaa", "aaaaa", "aa", "aaaa"] | ~sortF(lambda r: len(r)) | deref()
-
__invert__
() → k1lib.cli.modifier.sortF[source]¶
-
-
class
k1lib.cli.modifier.
consume
(f: Union[k1lib.cli.init.BaseCli, Callable[[T], None]])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(f: Union[k1lib.cli.init.BaseCli, Callable[[T], None]])[source]¶ Consumes the iterator in a side stream. Returns the iterator. Kinda like the bash command tee. Example:
# prints "0\n1\n2" and returns [0, 1, 2]
range(3) | consume(headOut()) | toList()
# prints "range(0, 3)" and returns [0, 1, 2]
range(3) | consume(lambda it: print(it)) | toList()
This is useful whenever you want to mutate something, but don’t want to include the function result into the main stream.
-
-
class
k1lib.cli.modifier.
randomize
(bs=100)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(bs=100)[source]¶ Randomizes the input stream. In order to be efficient, this does not convert the input iterator to a giant list and yield random values from that. Instead, this fetches bs items at a time, randomizes them, returns them, and then fetches another bs items. If you want to do the giant list, then just pass in float("inf"), or None. Example:
# returns [0, 1, 2, 3, 4], effectively no randomize at all
range(5) | randomize(1) | deref()
# returns something like this: [1, 0, 2, 3, 5, 4, 6, 8, 7, 9]. You can clearly see the batches
range(10) | randomize(3) | deref()
# returns something like this: [7, 0, 5, 2, 4, 9, 6, 3, 1, 8]
range(10) | randomize(float("inf")) | deref()
# same as above
range(10) | randomize(None) | deref()
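The batch-wise shuffle described above can be sketched with the standard library. randomize_sketch and its rng parameter are hypothetical; this is an illustration of the idea, not the library's code.

```python
import random
from itertools import islice


def randomize_sketch(stream, bs=100, rng=None):
    """Hypothetical sketch: shuffle `bs` items at a time instead of everything."""
    rng = rng or random.Random()
    it = iter(stream)
    whole_stream = bs is None or bs == float("inf")
    while True:
        # Either grab everything (the "giant list" case) or just one batch
        batch = list(it) if whole_stream else list(islice(it, bs))
        if not batch:
            return
        rng.shuffle(batch)
        yield from batch


unchanged = list(randomize_sketch(range(5), 1))
batched_shuffle = list(randomize_sketch(range(10), 3))
full_shuffle = list(randomize_sketch(range(10), None))
print(unchanged, batched_shuffle, full_shuffle)
```

With bs=3, the first three outputs are always some ordering of 0, 1, 2, which is the “you can clearly see the batches” effect from the example.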
-
-
class
k1lib.cli.modifier.
stagger
(every: int)[source]¶ Bases:
k1lib.cli.init.BaseCli
Staggers input stream into multiple stream “windows” placed serially. Best explained with an example:
o = range(10) | stagger(3)
o | deref() # returns [0, 1, 2], 1st "window"
o | deref() # returns [3, 4, 5], 2nd "window"
o | deref() # returns [6, 7, 8]
o | deref() # returns [9]
o | deref() # returns []
This might be useful when you’re constructing a data loader:
dataset = [range(20), range(30, 50)] | transpose()
dl = dataset | batched(3) | (transpose() | toTensor()).all() | stagger(4)
for epoch in range(3):
    for xb, yb in dl: # looping over a window
        print(epoch)
        # then something like: model(xb)
The above code will print 6 lines. 4 of them are “0” (because we stagger every 4 batches), and xb’s shape will be (3,) (because we batched every 3 samples).
You should also keep in mind that this doesn’t really change the property of the stream itself. Essentially, treat these pairs of statements as being the same thing:
o = range(11, 100)
# both return 11
o | stagger(20) | item()
o | item()
# both return [11, 12, ..., 20]
o | head(10) | deref()
o | stagger(20) | head(10) | deref()
Lastly, multiple iterators might be getting values from the same stream window, meaning:
o = range(11, 100) | stagger(10)
it1 = iter(o); it2 = iter(o)
next(it1) # returns 11
next(it2) # returns 12
This may or may not be desirable. Also this should be obvious, but I want to mention this in case it’s not clear to you.
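Both the “windows” and the shared-iterator caveat fall out naturally if every window reads from one underlying iterator. The class below is a hypothetical plain-Python sketch of that idea, not the library's implementation.

```python
from itertools import islice


class StaggerSketch:
    """Hypothetical sketch: each iteration yields the next `every` items."""

    def __init__(self, stream, every):
        self.it = iter(stream)  # one shared underlying iterator
        self.every = every

    def __iter__(self):
        # Every window is a lazy slice over the same shared iterator
        return islice(self.it, self.every)


o = StaggerSketch(range(10), 3)
windows = [list(o) for _ in range(5)]
print(windows)

# Two iterators over the same window compete for the same values
shared = StaggerSketch(range(11, 100), 10)
it1 = iter(shared)
it2 = iter(shared)
first, second = next(it1), next(it2)
print(first, second)
```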
-
class
k1lib.cli.modifier.
op
[source]¶ Bases:
k1lib._baseClasses.Absorber, k1lib.cli.init.BaseCli
Absorbs operations done on it and applies them on the stream. Based on Absorber. Example:
t = torch.tensor([[1, 2, 3], [4, 5, 6.0]])
# returns [torch.tensor([[4., 5., 6., 7., 8., 9.]])]
[t] | (op() + 3).view(1, -1).all() | deref()
Basically, you can treat op() as the input tensor. Tbh, you can do the same thing with this:
[t] | applyS(lambda t: (t+3).view(1, -1)).all() | deref()
But that’s kinda long and may not be obvious. This can be surprisingly resilient, as you can still combine with other cli tools as usual, for example:
# returns [2, 3], demonstrating "&" operator
torch.randn(2, 3) | (op().shape & identity()) | deref() | item()
a = torch.tensor([[1, 2, 3], [7, 8, 9]])
# returns torch.tensor([4, 5, 6]), demonstrating "+" operator for clis and not clis
(a | op() + 3 + identity() | item() == torch.tensor([4, 5, 6])).all()
# returns [[3], [3]], demonstrating .all() and "|" serial chaining
torch.randn(2, 3) | (op().shape.all() | deref())
# returns [[8, 18], [9, 19]], demonstrating you can treat `op()` as a regular function
[range(10), range(10, 20)] | transpose() | filt(op() > 7, 0) | deref()
Performance-wise, there is some degradation, but not a lot, so don’t worry about it:
n = 10_000_000
# takes 1.6s
for i in range(n): i**2
# takes 1.8s, 1.125x worse than for loop
range(n) | apply(lambda x: x**2) | ignore()
# takes 2.7s, 1.7x worse than for loop
range(n) | apply(op()**2) | ignore()
# takes 2.7s
range(n) | (op()**2).all() | ignore()
Reserved operations that are not absorbed are:
all
__ror__ (__or__ still works!)
op_solidify
output module¶
For operations that feel like the termination
-
class
k1lib.cli.output.
stdout
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Prints out all lines. If not iterable, then print out the input raw
-
class
k1lib.cli.output.
file
(fileName: str, text: bool = True)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
class
k1lib.cli.output.
pretty
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Pretty prints a table
-
class
k1lib.cli.output.
intercept
(raiseError: bool = True)[source]¶ Bases:
k1lib.cli.init.BaseCli
sam module¶
This is for functions that are .sam or .bam related
-
class
k1lib.cli.sam.
header
(long=True)[source]¶ Bases:
k1lib.cli.init.BaseCli
structural module¶
This is for functions that sort of change the table structure in a dramatic way. They’re the core transformations
-
k1lib.cli.structural.
yieldSentinel
¶ Object that can be yielded in a stream to ignore this stream for the moment in joinStreamsRandom.
-
class
k1lib.cli.structural.
joinStreamsRandom
(fs=[])[source]¶ Joins multiple streams randomly. If any stream runs out, then it quits. If any stream yields yieldSentinel, then it just ignores that result and continues. Could be useful in active learning. Example:
# could return [0, 1, 10, 2, 11, 12, 13, ...], with max length 20, typical length 18
[range(0, 10), range(10, 20)] | joinStreamsRandom() | deref()
stream2 = [[-5, yieldSentinel, -4, -3], yieldSentinel | repeat()] | joinStreams()
# could return [-5, -4, 0, -3, 1, 2, 3, 4, 5, 6], demonstrating yieldSentinel
[range(7), stream2] | joinStreamsRandom() | deref()
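The rules above (pick a stream at random, quit when the picked stream is exhausted, skip sentinels) can be sketched in plain Python. SENTINEL_SKETCH and join_streams_random_sketch are hypothetical names, and choosing streams uniformly is an assumption, not something the documentation states.

```python
import random

SENTINEL_SKETCH = object()  # hypothetical stand-in for yieldSentinel


def join_streams_random_sketch(streams, rng=None):
    """Hypothetical sketch: interleave streams randomly until one runs out."""
    rng = rng or random.Random()
    iterators = [iter(s) for s in streams]
    while True:
        it = rng.choice(iterators)   # assumption: uniform choice
        try:
            value = next(it)
        except StopIteration:
            return                   # the picked stream ran out, so quit
        if value is not SENTINEL_SKETCH:
            yield value              # sentinels are silently skipped


mixed = list(join_streams_random_sketch([range(0, 10), range(10, 20)]))
skipped = list(join_streams_random_sketch([[1, SENTINEL_SKETCH, 2]]))
print(mixed, skipped)
```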
-
class
k1lib.cli.structural.
transpose
(fillValue=None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(fillValue=None)[source]¶ Join multiple columns and loop through all rows. Aka transpose.
- Parameters
fillValue – if not None, then will try to zip longest with this fill value
Example:
# returns [[1, 4], [2, 5], [3, 6]]
[[1, 2, 3], [4, 5, 6]] | transpose() | deref()
# returns [[1, 4], [2, 5], [3, 6], [0, 7]]
[[1, 2, 3], [4, 5, 6, 7]] | transpose(0) | deref()
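The fillValue behaviour corresponds closely to zip versus itertools.zip_longest in the standard library. transpose_sketch below is a hypothetical helper showing that correspondence, not the library's code.

```python
from itertools import zip_longest


def transpose_sketch(table, fill_value=None):
    """Hypothetical sketch: zip columns, or zip-longest when a fill value is given."""
    if fill_value is None:
        # Stops at the shortest row
        return [list(col) for col in zip(*table)]
    # Pads shorter rows with fill_value
    return [list(col) for col in zip_longest(*table, fillvalue=fill_value)]


plain = transpose_sketch([[1, 2, 3], [4, 5, 6]])
padded = transpose_sketch([[1, 2, 3], [4, 5, 6, 7]], 0)
print(plain, padded)
```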
-
-
class
k1lib.cli.structural.
joinList
(element=None, begin=True)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(element=None, begin=True)[source]¶ Join element into list.
- Parameters
element – the element to insert. If None, then takes the input [e, […]], else takes the input […] as usual
Example:
# returns [5, 2, 6, 8]
[5, [2, 6, 8]] | joinList() | deref()
# also returns [5, 2, 6, 8]
[2, 6, 8] | joinList(5) | deref()
-
-
class
k1lib.cli.structural.
splitList
(*weights: List[float])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*weights: List[float])[source]¶ Splits list of elements into multiple lists. If no weights are provided, then automatically defaults to [0.8, 0.2]. Example:
# returns [[0, 1, 2, 3, 4, 5, 6, 7], [8, 9]]
range(10) | splitList(0.8, 0.2) | deref()
# same as the above
range(10) | splitList() | deref()
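A weighted split like the one above can be sketched with cumulative weights. split_list_sketch is a hypothetical helper, and the rounding of boundaries is an assumption; the real cli may place the cut points differently for lengths that don't divide evenly.

```python
def split_list_sketch(items, *weights):
    """Hypothetical sketch: split a list into consecutive chunks by weight."""
    items = list(items)
    weights = weights or (0.8, 0.2)  # documented default
    total = sum(weights)
    chunks, start, cumulative = [], 0, 0.0
    for w in weights:
        cumulative += w
        # assumption: boundaries are rounded to the nearest index
        end = round(len(items) * cumulative / total)
        chunks.append(items[start:end])
        start = end
    return chunks


train_valid = split_list_sketch(range(10), 0.8, 0.2)
default_split = split_list_sketch(range(10))
print(train_valid, default_split)
```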
-
-
class
k1lib.cli.structural.
joinStreams
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Joins multiple streams. Example:
# returns [1, 2, 3, 4, 5]
[[1, 2, 3], [4, 5]] | joinStreams() | deref()
-
class
k1lib.cli.structural.
activeSamples
(limit: int = inf)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(limit: int = inf)[source]¶ Yields active learning samples. Example:
o = activeSamples() ds = range(10) # normal dataset ds = [o, ds] | joinStreamsRandom() # dataset with active learning capability next(ds) # returns 0 next(ds) # returns 1 next(ds) # returns 2 o.append(20) next(ds) # can return 3 or 20 next(ds) # can return (4 or 20) or 4
So the point of this is to be a generator of samples. You can define your dataset as a mix of active learning samples and standard samples. Whenever there’s a data point that you want to focus on, you can add it to o and it will eventually be yielded.
- Parameters
limit – max number of active samples. Discards samples if number of samples is over this.
-
-
class
k1lib.cli.structural.
batched
(bs=32, includeLast=False)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(bs=32, includeLast=False)[source]¶ Batches the input stream. Example:
# returns [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
range(11) | batched(3) | deref()
# returns [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]
range(11) | batched(3, True) | deref()
# returns [[0, 1, 2, 3, 4]]
range(5) | batched(float("inf"), True) | deref()
# returns []
range(5) | batched(float("inf"), False) | deref()
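The four cases above, including the infinite batch size, fall out of a simple accumulate-and-flush loop. This is a minimal sketch of that idea under the assumption that an incomplete trailing batch is dropped unless includeLast is set; `batched_sketch` is a hypothetical name, not the library's code.

```python
def batched_sketch(items, bs=32, include_last=False):
    """Hypothetical illustration of batching a stream."""
    batch = []
    for e in items:
        batch.append(e)
        if len(batch) == bs:  # never true when bs is float("inf")
            yield batch
            batch = []
    # the leftover, incomplete batch is only emitted on request
    if include_last and batch:
        yield batch

full_only = list(batched_sketch(range(11), 3))
with_last = list(batched_sketch(range(11), 3, True))
inf_last = list(batched_sketch(range(5), float("inf"), True))
inf_drop = list(batched_sketch(range(5), float("inf"), False))
```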
-
-
k1lib.cli.structural.
collate
()[source]¶ Puts individual columns into a tensor. Example:
# returns [tensor([ 0, 10, 20]), tensor([ 1, 11, 21]), tensor([ 2, 12, 22])]
[range(0, 3), range(10, 13), range(20, 23)] | collate() | toList()
-
k1lib.cli.structural.
insertRow
(*row: List[T])[source]¶ Inserts a row right before all other rows. See also:
joinList()
.
-
k1lib.cli.structural.
insertColumn
(*column, begin=True, fillValue='')[source]¶ Inserts a column at beginning or end. Example:
# returns [['a', 1, 2], ['b', 3, 4]]
[[1, 2], [3, 4]] | insertColumn("a", "b") | deref()
-
k1lib.cli.structural.
insertIdColumn
(table=False, begin=True, fillValue='')[source]¶ Inserts an id column at the beginning (or end). Example:
# returns [[0, 'a', 2], [1, 'b', 4]]
[["a", 2], ["b", 4]] | insertIdColumn(True) | deref()
# returns [[0, 'a'], [1, 'b']]
"ab" | insertIdColumn()
- Parameters
table – if False, then inserts the column into an Iterator[str], else treats the input as a full-fledged table
-
class
k1lib.cli.structural.
toDict
[source]¶ Bases:
k1lib.cli.init.BaseCli
-
class
k1lib.cli.structural.
toDictF
(keyF: Optional[Callable[[Any], str]] = None, valueF: Optional[Callable[[Any], Any]] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(keyF: Optional[Callable[[Any], str]] = None, valueF: Optional[Callable[[Any], Any]] = None)[source]¶ Transform an incoming stream into a dict using a function for values. Example:
names = ["wanda", "vision", "loki", "mobius"]
names | toDictF(valueF=lambda s: len(s)) # will return {"wanda": 5, "vision": 6, ...}
names | toDictF(lambda s: s.title(), lambda s: len(s)) # will return {"Wanda": 5, "Vision": 6, ...}
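In plain Python, the same result can be had with a dict comprehension. The sketch below assumes that a missing keyF or valueF behaves like the identity function, which is what the first example above suggests; `to_dict_f_sketch` is a hypothetical name.

```python
def to_dict_f_sketch(items, key_f=None, value_f=None):
    """Hypothetical illustration: build a dict from a stream using two functions."""
    key_f = key_f or (lambda x: x)      # assumption: default is the element itself
    value_f = value_f or (lambda x: x)  # assumption: default is the element itself
    return {key_f(e): value_f(e) for e in items}

names = ["wanda", "vision", "loki", "mobius"]
d1 = to_dict_f_sketch(names, value_f=len)
d2 = to_dict_f_sketch(names, str.title, len)
print(d1)  # {'wanda': 5, 'vision': 6, 'loki': 4, 'mobius': 6}
print(d2)  # {'Wanda': 5, 'Vision': 6, 'Loki': 4, 'Mobius': 6}
```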
-
-
class
k1lib.cli.structural.
expandE
(f: Callable[[T], List[T]], column: int)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
k1lib.cli.structural.
unsqueeze
(dim: int = 0)[source]¶ Unsqueeze input iterator. Example:
t = [[1, 2], [3, 4], [5, 6]]
# returns torch.Size([3, 2])
torch.tensor(t).shape
# returns torch.Size([1, 3, 2])
torch.tensor(t | unsqueeze(0) | deref()).shape
# returns torch.Size([3, 1, 2])
torch.tensor(t | unsqueeze(1) | deref()).shape
# returns torch.Size([3, 2, 1])
torch.tensor(t | unsqueeze(2) | deref()).shape
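Without torch, the effect on nested lists can be pictured as wrapping at a given nesting depth. This is a sketch of the idea on plain lists only; `unsqueeze_sketch` and `shape_of` are hypothetical helpers, not part of the library.

```python
def unsqueeze_sketch(x, dim=0):
    """Hypothetical illustration: add a length-1 dimension at depth `dim`."""
    if dim == 0:
        return [x]  # wrap the whole thing
    # otherwise descend one level and wrap there
    return [unsqueeze_sketch(e, dim - 1) for e in x]

def shape_of(x):
    """Shape of a regular, non-empty nested list."""
    shape = []
    while isinstance(x, list):
        shape.append(len(x))
        x = x[0]
    return tuple(shape)

t = [[1, 2], [3, 4], [5, 6]]
shapes = [shape_of(unsqueeze_sketch(t, d)) for d in range(3)]
print(shapes)  # [(1, 3, 2), (3, 1, 2), (3, 2, 1)]
```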
-
class
k1lib.cli.structural.
count
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Finds unique elements and returns a table with [frequency, value, percent] columns. Example:
# returns [[1, 'a', '33%'], [2, 'b', '67%']]
['a', 'b', 'b'] | count() | deref()
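A table like the one above can be approximated with `collections.Counter`. The sketch assumes rows come out in first-seen order and that percentages are rounded to whole numbers; neither is confirmed by the text, and `count_sketch` is a hypothetical name.

```python
from collections import Counter

def count_sketch(items):
    """Hypothetical illustration: [frequency, value, percent] rows per unique element."""
    counts = Counter(items)  # keeps first-seen order of the keys
    total = sum(counts.values())
    return [[n, value, f"{round(100 * n / total)}%"] for value, n in counts.items()]

table = count_sketch(["a", "b", "b"])
print(table)  # [[1, 'a', '33%'], [2, 'b', '67%']]
```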
-
class
k1lib.cli.structural.
permute
(*permutations: List[int])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*permutations: List[int])[source]¶ Permutes the columns. Acts kinda like torch.Tensor.permute(). Example:
# returns [['b', 'a'], ['d', 'c']]
["ab", "cd"] | permute(1, 0) | deref()
-
-
class
k1lib.cli.structural.
accumulate
(columnIdx: int = 0, avg=False)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
class
k1lib.cli.structural.
AA_
(*idxs: List[int], wraps=False)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(*idxs: List[int], wraps=False)[source]¶ Returns 2 streams, one that has the selected element, and the other the rest. Example:
# returns [5, [1, 6, 3, 7]]
[1, 5, 6, 3, 7] | AA_(1)
# returns [[5, [1, 6, 3, 7]]]
[1, 5, 6, 3, 7] | AA_(1, wraps=True)
You can also put multiple indexes through:
# returns [[1, [5, 6]], [6, [1, 5]]]
[1, 5, 6] | AA_(0, 2)
If you don’t specify anything, then all indexes will be sliced:
# returns [[1, [5, 6]], [5, [1, 6]], [6, [1, 5]]]
[1, 5, 6] | AA_()
As for why the strange name, think of this operation as “AĀ”. In statistics, say you have a set “A”, then “not A” is commonly written as A with an overline “Ā”. So “AA_” represents “AĀ”, and that it first returns the selection A.
- Parameters
wraps – if True, then the first example will return [[5, [1, 6, 3, 7]]] instead, so that A has the same signature as Ā
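The "selected element plus the rest" idea can be sketched with list slicing. This reproduces the examples above on plain lists only; `aa_sketch` is a hypothetical name and the real class may handle streams differently.

```python
def aa_sketch(items, *idxs, wraps=False):
    """Hypothetical illustration: pair each selected element with the remaining ones."""
    items = list(items)
    chosen = idxs if idxs else range(len(items))  # no indexes means all of them
    pairs = [[items[i], items[:i] + items[i + 1:]] for i in chosen]
    # a single explicit index is returned unwrapped unless wraps=True
    if len(idxs) == 1 and not wraps:
        return pairs[0]
    return pairs

single = aa_sketch([1, 5, 6, 3, 7], 1)
wrapped = aa_sketch([1, 5, 6, 3, 7], 1, wraps=True)
several = aa_sketch([1, 5, 6], 0, 2)
everything = aa_sketch([1, 5, 6])
```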
-
-
class
k1lib.cli.structural.
peek
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Returns (firstRow, iterator). This sort of peeks at the first row, to potentially gain some insights about the internal formats. The returned iterator is not tampered with. Example:
e, it = iter([[1, 2, 3], [1, 2]]) | peek()
print(e) # prints "[1, 2, 3]"
s = 0
for e in it: s += len(e)
print(s) # prints "5", or length of 2 lists
You have to be careful when handling firstRow, because you might inadvertently alter the iterator:
e, it = iter([iter(range(3)), range(4), range(2)]) | peek()
e = list(e) # e is [0, 1, 2]
list(next(it)) # supposed to be the same as `e`, but is [] instead
This happens because you have already consumed all elements of the first row, and thus there aren’t any left when you try to call next(it).
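Both the peeking and the pitfall can be reproduced with `itertools.chain`. This is a minimal sketch of the general technique, not necessarily how the library does it; `peek_sketch` is a hypothetical name.

```python
import itertools

def peek_sketch(it):
    """Hypothetical helper: return (first_row, iterator that still starts at first_row)."""
    it = iter(it)
    first = next(it)  # raises StopIteration on empty input; not handled in this sketch
    return first, itertools.chain([first], it)

first, rows = peek_sketch(iter([[1, 2, 3], [1, 2]]))
total = sum(len(r) for r in rows)  # the first row is still part of the stream

# The pitfall: a first row that is itself a one-shot iterator gets used up
e, it2 = peek_sketch(iter([iter(range(3)), range(4), range(2)]))
consumed = list(e)
leftover = list(next(it2))  # same object as e, already exhausted
```

Here `first` and the first element of `rows` are the same object, which is why consuming one empties the other.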
-
class
k1lib.cli.structural.
peekF
(f: Union[k1lib.cli.init.BaseCli, Callable[[T], T]])[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(f: Union[k1lib.cli.init.BaseCli, Callable[[T], T]])[source]¶ Similar to peek, but will execute f(row) and return the input Iterator, which is not tampered with. Example:
it = lambda: iter([[1, 2, 3], [1, 2]])
# prints "[1, 2, 3]" and returns [[1, 2, 3], [1, 2]]
it() | peekF(lambda x: print(x)) | deref()
# prints "1\n2\n3"
it() | peekF(headOut()) | deref()
-
-
class
k1lib.cli.structural.
repeat
(limit: Optional[int] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
Yields the passed-in object a specified number of times. If you intend to pass in an iterator, then make a list out of it first, as a second copy of the iterator probably won’t work, because it will already have been consumed the first time. Example:
# returns [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
[1, 2, 3] | repeat(3) | toList()
- Parameters
limit – if None, then repeats indefinitely
-
k1lib.cli.structural.
repeatF
(f, limit: Optional[int] = None)[source]¶ Yields a specified amount generated by a specified function. Example:
# returns [4, 4, 4]
repeatF(lambda: 4, 3) | toList()
# returns 10
repeatF(lambda: 4) | head() | shape(0)
- Parameters
limit – if None, then repeats indefinitely
See also:
repeatFrom
-
class
k1lib.cli.structural.
repeatFrom
(limit: Optional[int] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(limit: Optional[int] = None)[source]¶ Yields from a list. If it runs out of elements, then starts over from the beginning, for limit passes in total. Example:
# returns [1, 2, 3, 1, 2]
[1, 2, 3] | repeatFrom() | head(5) | deref()
# returns [1, 2, 3, 1, 2, 3]
[1, 2, 3] | repeatFrom(2) | deref()
- Parameters
limit – if None, then repeats indefinitely
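The two modes, endless and limited, map naturally onto `itertools.cycle` and a bounded loop. This is a sketch of the idea only; `repeat_from_sketch` is a hypothetical name, and materializing the input as a list is an assumption made so the elements can be replayed.

```python
from itertools import cycle, islice

def repeat_from_sketch(items, limit=None):
    """Hypothetical illustration: replay a list, forever or for `limit` passes."""
    items = list(items)  # materialize so the elements can be replayed
    if limit is None:
        yield from cycle(items)
    else:
        for _ in range(limit):
            yield from items

first_five = list(islice(repeat_from_sketch([1, 2, 3]), 5))
two_passes = list(repeat_from_sketch([1, 2, 3], 2))
print(first_five)  # [1, 2, 3, 1, 2]
print(two_passes)  # [1, 2, 3, 1, 2, 3]
```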
-
utils module¶
This is for all short utilities that have the boilerplate feeling
-
class
k1lib.cli.utils.
size
(idx=None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(idx=None)[source]¶ Returns number of rows and columns in the input. Example:
# returns (3, 2)
[[2, 3], [4, 5, 6], [3]] | size()
# returns 3
[[2, 3], [4, 5, 6], [3]] | size(0)
# returns 2
[[2, 3], [4, 5, 6], [3]] | size(1)
# returns (2, 0)
[[], [2, 3]] | size()
# returns (3, None)
[2, 3, 5] | size()
# returns 3
[2, 3, 5] | size(0)
You can also pipe in a torch.Tensor, and it will just return its shape:
# returns torch.Size([3, 4])
torch.randn(3, 4) | size()
- Parameters
idx – if idx is None return (rows, columns). If 0 or 1, then rows or columns
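Judging from the examples above, the column count appears to be taken from the first row, and is None when that row has no length. The sketch below encodes that reading as an assumption; it ignores the torch.Tensor case, and `size_sketch` is a hypothetical name.

```python
def size_sketch(table, idx=None):
    """Hypothetical illustration: (rows, columns), with columns read off the first row."""
    rows = list(table)
    n_rows = len(rows)
    try:
        n_cols = len(rows[0])  # assumption: only the first row is inspected
    except (IndexError, TypeError):
        n_cols = None  # empty input, or rows that are not sized (e.g. plain numbers)
    if idx is None:
        return (n_rows, n_cols)
    return (n_rows, n_cols)[idx]

ragged = size_sketch([[2, 3], [4, 5, 6], [3]])
flat = size_sketch([2, 3, 5])
```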
-
-
k1lib.cli.utils.
shape
¶ alias of
k1lib.cli.utils.size
-
class
k1lib.cli.utils.
item
(amt: int = 1)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
class
k1lib.cli.utils.
identity
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Yields whatever the input is. Useful for multiple streams. Example:
# returns range(5)
range(5) | identity()
-
class
k1lib.cli.utils.
toStr
(column: Optional[int] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
class
k1lib.cli.utils.
join
(delim: Optional[str] = None)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(delim: Optional[str] = None)[source]¶ Merges all strings into 1, with delim in the middle. Basically str.join(). Example:
# returns '2\na'
[2, "a"] | join("\n")
-
-
class
k1lib.cli.utils.
toNumpy
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Converts generator to numpy array. Essentially
np.array(list(it))
-
class
k1lib.cli.utils.
toTensor
(dtype=torch.float32)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(dtype=torch.float32)[source]¶ Converts generator to torch.Tensor. Essentially torch.tensor(list(it)).
Also checks if input is a PIL Image. If yes, turn it into a torch.Tensor and return.
-
__ror__
(it: Iterator[float]) → torch.Tensor[source]¶
-
-
class
k1lib.cli.utils.
toList
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Converts generator to list.
list
would do the same, but this is just to maintain the style
-
class
k1lib.cli.utils.
wrapList
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Wraps inputs inside a list. There’s a more advanced cli tool built from this, which is
unsqueeze()
.
-
class
k1lib.cli.utils.
toSet
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Converts generator to set.
set
would do the same, but this is just to maintain the style
-
class
k1lib.cli.utils.
toIter
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Converts object to iterator. iter() would do the same, but this is just to maintain the style
-
class
k1lib.cli.utils.
toRange
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Returns iter(range(len(it))), effectively
-
class
k1lib.cli.utils.
equals
[source]¶ Bases:
object
Checks if all incoming columns/streams are identical
-
class
k1lib.cli.utils.
reverse
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Reverses incoming list. Example:
# returns [3, 5, 2]
[2, 5, 3] | reverse() | deref()
-
class
k1lib.cli.utils.
ignore
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Just loops through everything, ignoring the output. Example:
# will just return an iterator, and not print anything
[2, 3] | apply(lambda x: print(x))
# will print "2\n3"
[2, 3] | apply(lambda x: print(x)) | ignore()
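The same lazy-until-consumed behavior exists in plain Python with `map`. The sketch below uses a list append in place of print so the effect can be inspected, and drains the iterator with a zero-length deque, a common idiom; it is an analogy, not the library's implementation.

```python
from collections import deque

printed = []  # stands in for print(), so the effect can be inspected

lazy = map(printed.append, [2, 3])  # nothing has run yet
before = list(printed)

deque(lazy, maxlen=0)  # consume everything, keep nothing
after = list(printed)

print(before, after)  # [] [2, 3]
```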
-
class
k1lib.cli.utils.
toSum
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Calculates the sum of list of numbers. Can pipe in torch.Tensor. Example:
# returns 45
range(10) | toSum()
-
class
k1lib.cli.utils.
toAvg
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Calculates average of list of numbers. Can pipe in torch.Tensor. Example:
# returns 4.5
range(10) | toAvg()
# returns nan
[] | toAvg()
-
k1lib.cli.utils.
toMean
¶ alias of
k1lib.cli.utils.toAvg
-
class
k1lib.cli.utils.
toMax
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Calculates the max of a bunch of numbers. Can pipe in torch.Tensor. Example:
# returns 6
[2, 5, 6, 1, 2] | toMax()
-
class
k1lib.cli.utils.
toMin
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Calculates the min of a bunch of numbers. Can pipe in torch.Tensor. Example:
# returns 1
[2, 5, 6, 1, 2] | toMin()
-
class
k1lib.cli.utils.
lengths
(fs=[])[source]¶ Bases:
k1lib.cli.init.BaseCli
Returns the lengths of each row. Example:
[range(5), range(10)] | lengths() == [5, 10]
-
k1lib.cli.utils.
headerIdx
()[source]¶ Cuts out the first line, puts an index column next to it, and prints it out. Useful when you want to know a column’s index in order to cut it out. Also sets the context variable “header”, in case you need it later. Example:
# returns [[0, 'a'], [1, 'b'], [2, 'c']]
["abc"] | headerIdx() | deref()
-
class
k1lib.cli.utils.
deref
(ignoreTensors=True, maxDepth=inf)[source]¶ Bases:
k1lib.cli.init.BaseCli
-
__init__
(ignoreTensors=True, maxDepth=inf)[source]¶ Recursively converts any iterator into a list. Only str, numbers.Number and Module are not converted. Example:
# returns something like "<range_iterator at 0x7fa8c52ca870>"
iter(range(5))
# returns [0, 1, 2, 3, 4]
iter(range(5)) | deref()
You can also specify a maxDepth:
# returns something like "<list_iterator at 0x7f810cf0fdc0>"
iter([range(3)]) | deref(maxDepth=0)
# returns [range(3)]
iter([range(3)]) | deref(maxDepth=1)
# returns [[0, 1, 2]]
iter([range(3)]) | deref(maxDepth=2)
- Parameters
ignoreTensors – if True, then don’t loop over torch.Tensor internals
maxDepth – maximum depth to dereference. Starts at 0 for not doing anything at all
Warning
Can work well with PyTorch Tensors, but not Numpy arrays, as they screw things up with the __ror__ operator, so do torch.from_numpy(…) first. Don’t worry about unnecessary copying, as numpy and torch both utilize the buffer protocol.
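The recursion and the maxDepth cutoff described above can be sketched for plain Python objects. This leaves out tensors and Module entirely and only mirrors the documented examples; `deref_sketch` is a hypothetical name, not the library's code.

```python
import numbers
from collections.abc import Iterable

def deref_sketch(obj, max_depth=float("inf"), depth=0):
    """Hypothetical illustration: recursively turn iterables into lists."""
    if depth >= max_depth:
        return obj  # depth budget used up, hand back untouched
    if isinstance(obj, (str, numbers.Number)):
        return obj  # leaves are never converted
    if isinstance(obj, Iterable):
        return [deref_sketch(e, max_depth, depth + 1) for e in obj]
    return obj

flat = deref_sketch(iter(range(5)))
untouched_source = iter([range(3)])
depth0 = deref_sketch(untouched_source, 0)
depth1 = deref_sketch(iter([range(3)]), 1)
depth2 = deref_sketch(iter([range(3)]), 2)
print(flat, depth1, depth2)  # [0, 1, 2, 3, 4] [range(0, 3)] [[0, 1, 2]]
```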
-
__invert__
() → k1lib.cli.init.BaseCli[source]¶ Returns a
BaseCli
that makes everything an iterator.
-
others module¶
This is for pretty random clis that are scattered everywhere.
-
k1lib.cli.others.
crissCross
()[source]¶ Like the monkey-patched function torch.crissCross(). Example:
# returns another Tensor
[torch.randn(3, 3), torch.randn(3)] | crissCross()
There are a couple monkey-patched clis:
-
torch.
stack
()¶ Stacks tensors together
Elsewhere in the library¶
There might still be more cli tools scattered around the library. These are pretty rare, quite dynamic and most likely a cool extra feature, not a core functionality, so not worth it/can’t mention it here. Anyway, execute this:
cli.scatteredClis()
to get a list of them.