xiuminglib package¶

Submodules¶

xiuminglib.camera module¶

class xiuminglib.camera.PerspCam(name='cam', f_pix=533.33, im_res=(256, 256), loc=(1, 1, 1), lookat=(0, 0, 0), up=(0, 1, 0))[source]¶

Bases: object

Perspective camera in 35mm format.

This is not an OpenGL/Blender camera (where $+x$ points right, $+y$ up, and $-z$ into the viewing direction), but rather a “CV camera” (where $+x$ points right, $+y$ down, and $+z$ into the viewing direction). See more in ext_mat.

Because we mostly consider just the camera and the object, we assume the object coordinate system (the “local system” in Blender) aligns with (and hence, is the same as) the world coordinate system (the “global system” in Blender).

Note

Sensor width of the 35mm format is actually 36mm.
This class assumes unit pixel aspect ratio (i.e., $f_x = f_y$) and no skewing between the sensor plane and optical axis.
The active sensor size may be smaller than sensor_w and sensor_h, depending on im_res. See sensor_w_active and sensor_h_active.
aov, sensor_h, and sensor_w are hardware properties, having nothing to do with im_res.

__init__(name='cam', f_pix=533.33, im_res=(256, 256), loc=(1, 1, 1), lookat=(0, 0, 0), up=(0, 1, 0))[source]¶

Parameters:

name (str, optional) – Camera name.
f_pix (float, optional) – Focal length in pixels.
im_res (array_like, optional) – Image height and width in pixels.
loc (array_like, optional) – Camera location in object space.
lookat (array_like, optional) – Where the camera points to in object space, so default $(0, 0, 0)$ is the object center.
up (array_like, optional) – Vector in object space that, when projected, points upward in image.

aov¶

Vertical and horizontal angles of view in degrees.

Type:	numpy.ndarray
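As a rough illustration of how an angle of view follows from a sensor size and a focal length, the sketch below uses the standard pinhole relation, angle = 2 atan(sensor / (2 f)). The helper name `angle_of_view` and the 50mm focal length are illustrative assumptions, and this is not necessarily how the property is computed internally.

```python
import math


def angle_of_view(sensor_mm, f_mm):
    """Angle of view in degrees for a sensor extent and focal length, both in mm."""
    return math.degrees(2 * math.atan(sensor_mm / (2 * f_mm)))


# Vertical and horizontal angles for a 24mm-by-36mm sensor
# and a hypothetical 50mm focal length
aov_v = angle_of_view(24.0, 50.0)
aov_h = angle_of_view(36.0, 50.0)
```

A longer focal length narrows both angles, and the horizontal angle is always the larger one because the sensor is wider than it is tall.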

backproj(depth, fg_mask=None, bg_fill=0.0, depth_type='plane', space='object')[source]¶

Backprojects a depth map to 3D points.

Resolution of the depth map may be different from im_h and im_w: im_h and im_w decide the image coordinate bounds, and the depth resolution decides number of steps.

Parameters:

depth (numpy.ndarray) – Depth map.
fg_mask (numpy.ndarray, optional) – Backproject only pixels falling inside this foreground mask. Its values should be logical.
bg_fill (float, optional) – Filler value for background region.
depth_type (str, optional) – Plane or ray depth.
space (str, optional) – In which space the backprojected points are specified: `'object'` or `'camera'`.

Returns:	$xyz$ map.
Return type:	numpy.ndarray
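To make the backprojection idea concrete, here is a minimal NumPy sketch for plane depth (depth measured along the optical axis) in camera space. It assumes a principal point passed in explicitly as `cx` and `cy`, and pixel centers at half-integer coordinates. These conventions and the function name are assumptions for illustration, not the library's implementation.

```python
import numpy as np


def backproject_plane_depth(depth, f_pix, cx, cy):
    """Backprojects an H-by-W plane-depth map to an H-by-W-by-3 xyz map
    in a CV-style camera space (+x right, +y down, +z forward)."""
    h, w = depth.shape
    # Pixel-center coordinates: v runs down the rows, u runs along the columns
    v, u = np.meshgrid(np.arange(h) + 0.5, np.arange(w) + 0.5, indexing='ij')
    x = (u - cx) * depth / f_pix
    y = (v - cy) * depth / f_pix
    return np.stack((x, y, depth), axis=-1)


depth = np.full((4, 4), 2.0)
xyz = backproject_plane_depth(depth, f_pix=4.0, cx=2.0, cy=2.0)
```

With ray depth (distance along each viewing ray) the per-pixel depth would first need converting to plane depth, which this sketch leaves out.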

blender_rot_euler¶

Euler rotations in degrees.

Type:	numpy.ndarray

ext_mat¶

$3\times 4$ object-to-camera extrinsics matrix, i.e., rotation and translation that transform a point from object space to camera space.

Two coordinate systems are involved: object space “obj” and camera space following the computer vision convention “cv”, where $+x$ horizontally points right (to align with pixel coordinates), $+y$ vertically points down, and $+z$ is the look-at direction (because the system is right-handed).

Type:	numpy.ndarray
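The following sketch shows one common way to build such a $3\times 4$ object-to-camera matrix in the CV convention from a camera location, a look-at point, and an up vector. The function name `lookat_extrinsics_cv` is hypothetical, and the construction is a standard look-at derivation rather than a copy of this class's code.

```python
import numpy as np


def lookat_extrinsics_cv(loc, lookat, up):
    """Builds a 3x4 [R | t] with camera axes +x right, +y down, +z forward."""
    loc, lookat, up = (np.asarray(a, dtype=float) for a in (loc, lookat, up))
    forward = lookat - loc
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    down = np.cross(forward, right)
    rot = np.stack((right, down, forward))  # rows: camera axes in object space
    trans = -rot @ loc  # maps the camera location to the camera-space origin
    return np.hstack((rot, trans[:, None]))


ext = lookat_extrinsics_cv(loc=(0, 0, 5), lookat=(0, 0, 0), up=(0, 1, 0))
origin_cam = ext @ np.array([0.0, 0.0, 0.0, 1.0])  # object origin in camera space
above_cam = ext @ np.array([0.0, 1.0, 0.0, 1.0])   # a point above the origin
```

The object origin lands on the $+z$ axis, 5 units in front of the camera, and the point above it gets a negative $y$, i.e., it appears higher in the image. The sketch breaks down when the up vector is parallel to the viewing direction.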

ext_mat_4x4¶

Padding $[0, 0, 0, 1]$ to bottom of the $3\times 4$ extrinsics matrix to make it invertible.

Type:	numpy.ndarray

f_mm¶

35mm format-equivalent focal length in mm.

Type:	float

f_pix¶

Focal length in pixels.

Type:	float

gen_rays(spp=1)[source]¶

Generates ray directions in object space, with the ray origin being the camera location.

Parameters:	spp (int, optional) – Samples (or number of rays) per pixel. Must be a perfect square $S^2$ due to uniform, deterministic supersampling.
Returns:	An $H\times W\times S^2\times 3$ array of ray directions.
Return type:	numpy.ndarray

get_cam2obj(cam_type='cv', square=False)[source]¶

Inverse of get_obj2cam().

One example use: calling this with cam_type='blender' gives Blender’s cam.matrix_world.

get_obj2cam(cam_type='cv', square=False)[source]¶

Gets the object-to-camera transformation matrix.

Parameters:	cam_type (str, optional) – Accepted are `'cv'`/`'opencv'` and `'opengl'`/`'blender'`. square (bool, optional) – If true, the last row of $[0, 0, 0, 1]$ is kept, which makes the matrix invertible.
Returns:	$3\times 4$ or $4\times 4$ object-to-camera transformation matrix.
Return type:	numpy.ndarray

im_h¶

Image height.

Type:	int

im_w¶

Image width.

Type:	int

int_mat¶

$3\times 3$ intrinsics matrix.

Type:	numpy.ndarray

loc¶

Camera location in object space.

Type:	numpy.ndarray

lookat¶

Where in object space the camera points to.

Type:	numpy.ndarray

mm_per_pix¶

Millimeters per pixel.

Type:	float

name¶

Camera name.

Type:	str

proj(pts, space='object')[source]¶

Projects 3D points to 2D.

Parameters:

pts (array_like) – 3D point(s) of shape $N\times 3$ or $3\times N$, or of length 3.
space (str, optional) – In which space these points are specified: 'object' or 'camera'.

Returns:

Vertical and horizontal coordinates of the projections, following:

+-----------> dim1
|
|
|
v dim0

Return type:

array_like

proj_mat¶

$3\times 4$ projection matrix, derived from intrinsics and extrinsics.

Type:	numpy.ndarray
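As a small worked example of combining intrinsics and extrinsics, the sketch below forms a projection matrix and projects two points, returning (vertical, horizontal) coordinates in the order used by proj(). The principal point at the image center and all numeric values are assumptions made for the example.

```python
import numpy as np

f_pix, im_h, im_w = 100.0, 64, 64
# Intrinsics, with the principal point assumed to sit at the image center
int_mat = np.array([[f_pix, 0.0, im_w / 2],
                    [0.0, f_pix, im_h / 2],
                    [0.0, 0.0, 1.0]])
# Extrinsics of a CV camera at (0, 0, 5) looking at the origin, with +y up
ext_mat = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, -1.0, 0.0, 0.0],
                    [0.0, 0.0, -1.0, 5.0]])
proj_mat = int_mat @ ext_mat  # 3x4


def project(pts, proj_mat):
    """Projects N-by-3 points and returns N-by-2 (vertical, horizontal) coordinates."""
    pts = np.atleast_2d(np.asarray(pts, dtype=float))
    homo = np.hstack((pts, np.ones((len(pts), 1))))
    uvw = homo @ proj_mat.T
    u = uvw[:, 0] / uvw[:, 2]  # horizontal (dim1)
    v = uvw[:, 1] / uvw[:, 2]  # vertical (dim0)
    return np.stack((v, u), axis=1)


vh = project([[0, 0, 0], [1, 1, 0]], proj_mat)
```

The origin projects to the image center, and the point up and to the right of it in object space lands at a smaller vertical and a larger horizontal coordinate.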

resize(new_h=None, new_w=None)[source]¶

Updates the camera intrinsics according to the new size.

Parameters:	new_h (int, optional) – Target height. If `None`, will be calculated according to the target width, assuming the same aspect ratio. new_w (int, optional) – Target width. If `None`, will be calculated according to the target height, assuming the same aspect ratio.

sensor_fit_horizontal¶

Whether field of view angle fits along the horizontal or vertical direction.

Type:	bool

sensor_h¶

Sensor’s physical height (fixed at 24mm).

Type:	float

sensor_h_active¶

Actual sensor height (mm) in use (resolution-dependent).

Type:	float

sensor_w¶

Sensor’s physical width (fixed at 36mm).

Type:	float

sensor_w_active¶

Actual sensor width (mm) in use (resolution-dependent).

Type:	float

set_from_mitsuba(xml_path)[source]¶

Sets camera according to a Mitsuba XML file.

Parameters:	xml_path (str) – Path to the XML file.

to_dict(app=None)[source]¶

Converts this camera to a dictionary of its properties.

Parameters:	app (str, optional) – For what application are we converting? Accepted are `None` and `'blender'`.
Returns:	This camera as a dictionary.
Return type:	dict

up¶

Up vector, the vector in object space that, when projected, points upward on image plane.

Type:	numpy.ndarray

xiuminglib.camera.safe_cast_to_int(x)[source]¶

Casts a string or float to integer only when safe.

Parameters:	x (str or float) – Input to be cast to integer.
Returns:	Integer version of the input.
Return type:	int

xiuminglib.const module¶

class xiuminglib.const.Dir[source]¶

Bases: object

data = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data'¶

mstatus = '/tmp/machine-status/runtime'¶

tmp = '/tmp/'¶

class xiuminglib.const.Path[source]¶

Bases: object

armadillo = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/armadillo.ply'¶

buddha = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip.ply'¶

buddha_prefix = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip'¶

buddha_res2 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res2.ply'¶

buddha_res3 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res3.ply'¶

buddha_res4 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res4.ply'¶

bunny = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper.ply'¶

bunny_prefix = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper'¶

bunny_res2 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res2.ply'¶

bunny_res3 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res3.ply'¶

bunny_res4 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res4.ply'¶

cameraman = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/images/cameraman.png'¶

checker = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/textures/checker.png'¶

cpustatus = '/tmp/cpu/machine_status.txt'¶

dragon = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip.ply'¶

dragon_prefix = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip'¶

dragon_res2 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res2.ply'¶

dragon_res3 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res3.ply'¶

dragon_res4 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res4.ply'¶

gpustatus = '/tmp/gpu/{machine_name}'¶

lenna = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/images/lenna.png'¶

lpips_weights = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/lpips/net-lin_alex_v0.1.pb'¶

open_sans_regular = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/fonts/open-sans/OpenSans-Regular.ttf'¶

teapot = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/teapot.obj'¶

xiuminglib.decor module¶

Decorators that wrap a function.

If the function is defined in the file where you want to use the decorator, you can decorate the function at define time:

@decorator
def somefunc():
    return

If the function is defined somewhere else, do:

from numpy import mean

mean = decorator(mean)
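The two usage styles above can be tried with a toy decorator. The sketch below defines a hypothetical `count_calls` decorator (not part of this module) and applies it both at definition time and to a function defined elsewhere.

```python
import functools
import math


def count_calls(func):
    """Toy decorator that records how many times the wrapped function is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.n_calls += 1
        return func(*args, **kwargs)
    wrapper.n_calls = 0
    return wrapper


# Style 1: decorate at definition time
@count_calls
def double(x):
    return 2 * x


# Style 2: wrap a function that is defined somewhere else
sqrt = count_calls(math.sqrt)

doubled = double(3)
root = sqrt(16.0)
sqrt(25.0)
```

Using functools.wraps keeps the wrapped function's name and docstring, which helps when several decorators are stacked.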

xiuminglib.decor.colossus_interface(somefunc)[source]¶

Wraps black-box functions to read from and write to Google Colossus.

Because it’s hard (if possible at all) to figure out which path is input, and which is output, when the input function is black-box, this is a “best-effort” decorator (see below for warnings).

This decorator works by looping through all the positional and keyword parameters, copying CNS paths that exist prior to somefunc execution to temporary local locations, running somefunc and writing its output to local locations, and finally copying local paths that get modified by somefunc to their corresponding CNS locations.

Warning

Therefore, if somefunc’s output already exists (e.g., you are re-running the function to overwrite the old result), it will be copied to local, overwritten by somefunc locally, and finally copied back to CNS. This doesn’t lead to wrong behaviors, but is inefficient.

This decorator doesn’t depend on Blaze, as it’s using the fileutil CLI, rather than google3.pyglib.gfile. This is convenient in at least two cases:

You are too lazy to use Blaze, want to run tests quickly on your local machine, but need access to CNS files.
Your IO is more complex than what with gfile.Open(...) as h: can do (e.g., a Blender function importing an object from a path), in which case you have to copy the CNS file to local (“local” here could also mean a Borglet’s local).

This interface generally works with resolved paths (e.g., /path/to/file), but not with wildcard paths (e.g., /path/to/???), since it’s hard (if possible at all) to guess what your function tries to do with such wildcard paths.

Writes

Input files copied from Colossus to $TMP/.
Output files generated to $TMP/, to be copied to Colossus.

xiuminglib.decor.existok(makedirs_func)[source]¶: Implements the exist_ok flag (available in Python 3.2+) for a makedirs function. This avoids the race condition in which one parallel worker finds that the folder doesn’t exist and tries to create it, only for another worker to create it first.
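A minimal sketch of the idea, assuming the decorator simply swallows the "already exists" error raised by the wrapped function; the name `existok_sketch` is hypothetical and the library's actual implementation may differ.

```python
import functools
import os
import tempfile


def existok_sketch(makedirs_func):
    """Makes a makedirs-like function tolerate an already existing folder."""
    @functools.wraps(makedirs_func)
    def wrapper(*args, **kwargs):
        try:
            makedirs_func(*args, **kwargs)
        except FileExistsError:
            # Another worker created the folder first; nothing left to do
            pass
    return wrapper


safe_makedirs = existok_sketch(os.makedirs)

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, 'a', 'b')
    safe_makedirs(target)
    safe_makedirs(target)  # plain os.makedirs would raise FileExistsError here
    created = os.path.isdir(target)
```

Catching the error instead of checking for existence beforehand is what removes the race: there is no gap between the check and the creation.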

xiuminglib.decor.main()[source]¶: Unit tests that can also serve as example usage.

xiuminglib.decor.timeit(somefunc)[source]¶: Outputs the time a function takes to execute.

xiuminglib.img module¶

xiuminglib.img.alpha_blend(arr1, alpha, arr2=None)[source]¶

Alpha-blends two arrays, or masks one array.

Parameters:	arr1 (numpy.ndarray) – Input array. alpha (numpy.ndarray) – Alpha map whose values are $\in [0,1]$. arr2 (numpy.ndarray) – Input array. If `None`, `arr1` will be blended with an all-zero array, equivalent to masking `arr1`.
Returns:	Blended array of type `float`.
Return type:	numpy.ndarray
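For illustration, alpha blending of this kind is usually a per-element weighted sum. The sketch below assumes alpha weights `arr1` and (1 - alpha) weights `arr2`, which is consistent with the masking behavior described above; the function name is hypothetical.

```python
import numpy as np


def alpha_blend_sketch(arr1, alpha, arr2=None):
    """Blends arr1 over arr2 with per-element weights alpha in [0, 1]."""
    arr1 = np.asarray(arr1, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if arr2 is None:
        arr2 = np.zeros_like(arr1)  # blending with zeros masks arr1
    return alpha * arr1 + (1 - alpha) * np.asarray(arr2, dtype=float)


fg = np.full((2, 2), 10.0)
bg = np.full((2, 2), 2.0)
alpha = np.array([[1.0, 0.5], [0.25, 0.0]])
blended = alpha_blend_sketch(fg, alpha, bg)
masked = alpha_blend_sketch(fg, alpha)
```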

xiuminglib.img.binarize(im, threshold=None)[source]¶

Binarizes images.

Parameters:	im (numpy.ndarray) – Image to binarize. Of any integer type (`uint8`, `uint16`, etc.). If H-by-W-by-3, will be converted to grayscale and treated as H-by-W. threshold (float, optional) – Threshold for binarization. `None` means midpoint of the `dtype`.
Returns:	Binarized image. Of only 0’s and 1’s.
Return type:	numpy.ndarray

xiuminglib.img.compute_gradients(im)[source]¶

Computes magnitudes and orientations of image gradients.

With Scharr operators:

[ 3 0 -3 ]           [ 3  10  3]
[10 0 -10]    and    [ 0   0  0]
[ 3 0 -3 ]           [-3 -10 -3]

Parameters:	im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C if multi-channel (e.g., RGB) images. Gradients are computed independently for each of the C channels.
Returns:

grad_mag (numpy.ndarray) – Magnitude image of the gradients.
grad_orient (numpy.ndarray) – Orientation image of the gradients (in radians), following:

             y
             ^
        pi/2 |
             |
    pi       |
    ---------+---------> 0
    -pi      |         x
             |
       -pi/2

Return type:	tuple
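The sketch below computes Scharr-weighted differences for a single-channel image with plain NumPy, then derives magnitude and orientation. It fixes the signs so that $+x$ points right and $+y$ points up, matching the orientation diagram; whether the library applies its kernels by convolution or correlation, and how it pads borders, is not stated here, so treat those choices as assumptions.

```python
import numpy as np


def scharr_gradients(im):
    """Returns gradient magnitude and orientation (radians) of an H-by-W image."""
    p = np.pad(np.asarray(im, dtype=float), 1, mode='edge')
    # The eight neighbors of every pixel, as shifted views of the padded image
    tl, t, tr = p[:-2, :-2], p[:-2, 1:-1], p[:-2, 2:]
    l, r = p[1:-1, :-2], p[1:-1, 2:]
    bl, b, br = p[2:, :-2], p[2:, 1:-1], p[2:, 2:]
    grad_x = 3 * (tr - tl) + 10 * (r - l) + 3 * (br - bl)  # right minus left
    grad_y = 3 * (tl - bl) + 10 * (t - b) + 3 * (tr - br)  # top minus bottom
    grad_mag = np.hypot(grad_x, grad_y)
    grad_orient = np.arctan2(grad_y, grad_x)
    return grad_mag, grad_orient


ramp = np.tile(np.arange(5.0), (5, 1))       # brightness increases to the right
mag_h, orient_h = scharr_gradients(ramp)
mag_v, orient_v = scharr_gradients(ramp.T)   # brightness increases downward
```

For the horizontal ramp the interior orientation is 0 (pointing along $+x$), and for the ramp that brightens downward it is $-\pi/2$.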

xiuminglib.img.denormalize_float(arr, uint_type='uint8')[source]¶

De-normalizes the input float array such that $1$ becomes the target uint maximum.

Parameters:	arr (numpy.ndarray) – Input array of type `float`. uint_type (str, optional) – Target `uint` type.
Returns:	De-normalized array of the target type.
Return type:	numpy.ndarray

xiuminglib.img.find_local_extrema(im, want_maxima, kernel_size=3)[source]¶

Finds local maxima or minima in an image.

Parameters:	im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C for multi-channel (e.g., RGB) images. Extrema are found independently for each of the C channels. want_maxima (bool) – Whether maxima or minima are wanted. kernel_size (int, optional) – Side length of the square window under consideration. Must be larger than 1.
Returns:	Binary map indicating if each pixel is a local extremum.
Return type:	numpy.ndarray

xiuminglib.img.gamma_correct(im, gamma=2.2)[source]¶

Applies gamma correction to a uint image.

Parameters:

im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C if multi-channel (e.g., RGB) `uint` images.
gamma (float, optional) – A value $< 1$ shifts the image towards the darker end of the spectrum, while a value $> 1$ shifts it towards the brighter end.

Returns:	Gamma-corrected image.
Return type:	numpy.ndarray
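One way to get the described behavior (gamma above 1 brightens, below 1 darkens) is to normalize the image, raise it to the power 1/gamma, and scale back. The sketch below assumes exactly that formula and rounding scheme; the function name is hypothetical.

```python
import numpy as np


def gamma_correct_sketch(im, gamma=2.2):
    """Applies out = in ** (1 / gamma) on a uint image, preserving its dtype."""
    max_val = np.iinfo(im.dtype).max
    im_float = im.astype(float) / max_val          # to [0, 1]
    corrected = im_float ** (1 / gamma)
    return np.round(corrected * max_val).astype(im.dtype)


im = np.array([[0, 64, 128, 255]], dtype=np.uint8)
brighter = gamma_correct_sketch(im, gamma=2.2)
darker = gamma_correct_sketch(im, gamma=0.5)
```

Black and white stay fixed under this mapping; only the mid-tones move.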

xiuminglib.img.grid_query_img(im, query_x, query_y, method='bilinear')[source]¶

Grid queries an image via interpolation.

If you want to grid query unstructured data, consider grid_query_unstruct().

This function uses either bilinear interpolation that allows you to break big matrices into patches and work locally, or bivariate spline interpolation that fits a global spline (so memory-intensive) and shows global effects.

Parameters:

im (numpy.ndarray) – H-by-W or H-by-W-by-C rectangular grid of data. Each of C channels is interpolated independently.
query_x (array_like) – $x$ coordinates of the queried rectangle, e.g., `np.arange(10)` for a 10-by-10 grid (hence, this should not be generated by `numpy.meshgrid()` or similar functions).
query_y (array_like) – $y$ coordinates, following this convention:

+---------> query_x
|
|
|
v query_y

method (str, optional) – Interpolation method: `'spline'` or `'bilinear'`.

Returns:	Interpolated values at query locations, of shape `(len(query_y), len(query_x))` for single-channel input or `(len(query_y), len(query_x), im.shape[2])` for multi-channel input.
Return type:	numpy.ndarray

xiuminglib.img.grid_query_unstruct(uvs, values, grid_res, method=None)[source]¶

Grid queries unstructured data given by coordinates and their values.

If you are looking to grid query structured data, such as an image, check out grid_query_img().

This function interpolates values on a rectangular grid given some sparse, unstructured samples. One use case is where you have some UV locations and their associated colors, and you want to “paint the colors” on a UV canvas.

Parameters:

uvs (numpy.ndarray) – N-by-2 array of UV coordinates where we have values (e.g., colors). See xiuminglib.blender.object.smart_uv_unwrap() for the UV coordinate convention.
values (numpy.ndarray) – N-by-M array of M-D values at the N UV locations, or N-array of scalar values at the N UV locations. Channels are interpolated independently.
grid_res (array_like) – Resolution (height first; then width) of the query grid.

method (dict, optional) –

Dictionary of method-specific parameters. Implemented methods and their default parameters:

# Default
method = {
    'func': 'griddata',
    # Which SciPy function to call.

    'func_underlying': 'linear',
    # Fed to `griddata` as the `method` parameter.

    'fill_value': (0,), # black
    # Will be used to fill in pixels outside the convex hulls
    # formed by the UV locations, and if `max_l1_interp` is
    # provided, also the pixels whose interpolation is too much
    # of a stretch to be trusted. In the context of "canvas
    # painting," this will be the canvas' base color.

    'max_l1_interp': np.inf, # trust/accept all interpolations
    # Maximum L1 distance, which we can trust in interpolation,
    # to pixels that have values. Interpolation across a longer
    # range will not be trusted, and hence will be filled with
    # `fill_value`.
}

method = {
    'func': 'rbf',
    # Which SciPy function to call.

    'func_underlying': 'linear',
    # Fed to `Rbf` as the `function` parameter.

    'smooth': 0, # no smoothing
    # Fed to `Rbf` as the `smooth` parameter.
}

Returns:

Interpolated values at query locations, of shape grid_res for single-channel input or (grid_res[0], grid_res[1], values.shape[1]) for multi-channel input.

Return type:

numpy.ndarray

xiuminglib.img.linear2srgb(im, clip=False)[source]¶

Converts an image from linear RGB values to sRGB.

Parameters:	im (numpy.ndarray) – Of type `float`, and all pixels must be $\in [0, 1]$. clip (bool, optional) – Whether to clip values to $[0,1]$. Defaults to `False`.
Returns:	Converted image in sRGB.
Return type:	numpy.ndarray

xiuminglib.img.normalize_uint(arr)[source]¶

Normalizes the input uint array such that its dtype maximum becomes $1$.

Parameters:	arr (numpy.ndarray) – Input array of type `uint`.
Returns:	Normalized array of type `float`.
Return type:	numpy.ndarray

xiuminglib.img.remove_islands(im, min_n_pixels, connectivity=4)[source]¶

Removes small islands of pixels from a binary image.

Parameters:	im (numpy.ndarray) – Input binary image. Of only 0’s and 1’s. min_n_pixels (int) – Minimum island size to keep. connectivity (int, optional) – Definition of “connected”: either 4 or 8.
Returns:	Output image with small islands removed.
Return type:	numpy.ndarray

xiuminglib.img.resize(arr, new_h=None, new_w=None, method='cv2')[source]¶

Resizes an image, with the option of maintaining the aspect ratio.

Parameters:

arr (numpy.ndarray) – Image to resize. If multiple-channel, each channel is resized independently.
new_h (int, optional) – Target height. If `None`, will be calculated according to the target width, assuming the same aspect ratio.
new_w (int, optional) – Target width. If `None`, will be calculated according to the target height, assuming the same aspect ratio.
method (str, optional) – Accepted values: `'cv2'`, `'tf'`, and `'pil'`.

Returns:	Resized image.
Return type:	numpy.ndarray

xiuminglib.img.rgb2lum(im)[source]¶

Converts RGB to relative luminance (if input is linear RGB) or luma (if input is gamma-corrected RGB).

Parameters:	im (numpy.ndarray) – RGB array of shape `(..., 3)`.
Returns:	Relative luminance or luma array.
Return type:	numpy.ndarray
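As an illustration of such a conversion, the sketch below takes a weighted sum over the last axis using the Rec. 709 coefficients. Which coefficients this function actually uses is not stated here, so the weights and the function name are assumptions.

```python
import numpy as np

# Rec. 709 coefficients, assumed here for illustration
REC709_WEIGHTS = np.array([0.2126, 0.7152, 0.0722])


def rgb2lum_sketch(im):
    """Weighted sum over the last (RGB) axis of an array of shape (..., 3)."""
    return np.asarray(im, dtype=float) @ REC709_WEIGHTS


white = rgb2lum_sketch([1.0, 1.0, 1.0])
green = rgb2lum_sketch([0.0, 1.0, 0.0])
lum_map = rgb2lum_sketch(np.ones((2, 2, 3)))
```

The same weighted sum yields relative luminance for linear input and luma for gamma-corrected input, which is why the two cases share one function.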

xiuminglib.img.srgb2linear(im, clip=False)[source]¶

Converts an image from sRGB values to linear RGB.

Parameters:	im (numpy.ndarray) – Of type `float`, and all pixels must be $\in [0, 1]$. clip (bool, optional) – Whether to clip values to $[0,1]$. Defaults to `False`.
Returns:	Converted image in linear RGB.
Return type:	numpy.ndarray
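For reference, the standard sRGB transfer function and its inverse can be sketched as below. This follows the usual piecewise definition (a linear segment near black and a power-law segment elsewhere), assumes inputs already lie in $[0, 1]$, and omits the optional clipping; the function names are hypothetical.

```python
import numpy as np


def linear2srgb_sketch(im):
    """Linear RGB in [0, 1] to sRGB, using the standard piecewise curve."""
    im = np.asarray(im, dtype=float)
    low = 12.92 * im
    high = 1.055 * np.power(im, 1 / 2.4) - 0.055
    return np.where(im <= 0.0031308, low, high)


def srgb2linear_sketch(im):
    """sRGB in [0, 1] back to linear RGB (inverse of the curve above)."""
    im = np.asarray(im, dtype=float)
    low = im / 12.92
    high = np.power((im + 0.055) / 1.055, 2.4)
    return np.where(im <= 0.04045, low, high)


linear = np.linspace(0, 1, 11)
srgb = linear2srgb_sketch(linear)
roundtrip = srgb2linear_sketch(srgb)
```

Mid-gray in linear space (0.5) maps to roughly 0.735 in sRGB, and converting back recovers the input.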

xiuminglib.img.tonemap(hdr, method='gamma', gamma=2.2)[source]¶

Tonemaps an HDR image.

Parameters:	hdr (numpy.ndarray) – HDR image. method (str, optional) – Values accepted: `'gamma'` and `'reinhard'`. gamma (float, optional) – Gamma value used if method is `'gamma'`.
Returns:	Tonemapped image $\in [0, 1]$.
Return type:	numpy.ndarray

xiuminglib.imprt module¶

xiuminglib.imprt.import_module_404ok(*args, **kwargs)[source]¶: Returns None (instead of failing) in the case of ModuleNotFoundError.

xiuminglib.imprt.preset_import(name, assert_success=False)[source]¶: A unified importer for both regular and google3 modules, according to specified presets/profiles (e.g., ignoring ModuleNotFoundError).

xiuminglib.interact module¶

xiuminglib.interact.ask_to_proceed(msg, level='warning')[source]¶

Pauses there to ask the user whether to proceed.

Parameters:	msg (str) – Message to display to the user. level (str, optional) – Message level, essentially deciding the message color: `'info'`, `'warning'`, or `'error'`.

xiuminglib.interact.format_print(msg, fmt)[source]¶

Prints a message with format.

Parameters:	msg (str) – Message to print. fmt (str) – Format; try your luck with any value – don’t worry; if it’s illegal, you will be prompted with all legal values.

xiuminglib.interact.print_attrs(obj, excerpts=None, excerpt_win_size=60, max_recursion_depth=None)[source]¶

Prints all attributes, recursively, of an object.

Parameters:

obj (object) – Object in which we search for the attribute.
excerpts (str or list(str), optional) – Print only excerpts containing certain attributes. `None` means to print all.
excerpt_win_size (int, optional) – How many characters get printed around a match.
max_recursion_depth (int, optional) – Maximum recursion depth. `None` means no limit.

xiuminglib.linalg module¶

xiuminglib.linalg.angle_between(vec1, vec2, radian=True)[source]¶

Computes the angle between two vectors.

Parameters:

vec1 (array_like) – Vector 1.
vec2 (array_like) – Vector 2.
radian (bool, optional) – Whether to use radians.

Returns:	The angle $\in [0,\pi]$.
Return type:	float
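A minimal sketch of the computation, assuming the usual arccosine of the normalized dot product; the clipping guards against round-off pushing the cosine slightly outside $[-1, 1]$. The function name is hypothetical.

```python
import numpy as np


def angle_between_sketch(vec1, vec2, radian=True):
    """Angle in [0, pi] (or [0, 180] degrees) between two vectors."""
    v1 = np.asarray(vec1, dtype=float)
    v2 = np.asarray(vec2, dtype=float)
    cos = v1.dot(v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = np.arccos(np.clip(cos, -1.0, 1.0))
    return angle if radian else np.degrees(angle)


right_angle = angle_between_sketch([1, 0, 0], [0, 1, 0])
parallel = angle_between_sketch([1, 0, 0], [2, 0, 0])
opposite_deg = angle_between_sketch([1, 0], [-1, 0], radian=False)
```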

xiuminglib.linalg.calc_refl_vec(h, l)[source]¶

Calculates the reflection vector given the half vector.

Parameters:	h (array_like) – Half vector as a 3-array. l (array_like) – “Incident” vector (pointing outwards from the surface point), as a 3-array.
Returns:	Reflection vector as a 3-array.
Return type:	numpy.ndarray
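If the half vector $h$ bisects the incident vector $l$ and the reflection vector $r$ (all pointing away from the surface point), then $r = 2(h\cdot l)h - l$ for unit $h$. The sketch below implements this relation; the function name is hypothetical and the normalization of $h$ is an added safeguard.

```python
import numpy as np


def reflect_about_half_vector(h, l):
    """Reflects l about the direction of h: r = 2 (h . l) h - l."""
    h = np.asarray(h, dtype=float)
    l = np.asarray(l, dtype=float)
    h = h / np.linalg.norm(h)
    return 2 * h.dot(l) * h - l


l = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)
h = np.array([0.0, 0.0, 1.0])
r = reflect_about_half_vector(h, l)
bisector = (l + r) / np.linalg.norm(l + r)
```

The check at the end confirms the defining property: the normalized sum of $l$ and $r$ recovers $h$.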

xiuminglib.linalg.is_identity(mat, eps=None)[source]¶

Checks if a matrix is an identity matrix.

If the input is not even square, False is returned.

Parameters:	mat (numpy.ndarray) – Input matrix. eps (float, optional) – Numerical tolerance for equality. `None` means `np.finfo(mat.dtype).eps`.
Returns:	Whether the input is an identity matrix.
Return type:	bool

xiuminglib.linalg.is_symmetric(mat, eps=None)[source]¶

Checks if a matrix is symmetric.

If the input is not even square, False is returned.

Parameters:	mat (numpy.ndarray) – Input matrix. eps (float, optional) – Numerical tolerance for equality. `None` means `np.finfo(mat.dtype).eps`.
Returns:	Whether the input is symmetric.
Return type:	bool

xiuminglib.linalg.main(func_name)[source]¶: Unit tests that can also serve as example usage.

xiuminglib.linalg.normalize(vecs, axis=0)[source]¶

Normalizes vectors.

Parameters:	vecs (array_like) – 1D array for a single vector, 2D array for multiple vectors, 3D array for an “image” of vectors, etc. axis (int, optional) – Along which axis normalization is done.
Returns:	Normalized vector(s) of the same shape as input.
Return type:	numpy.ndarray

xiuminglib.linalg.project_onto(pts, basis)[source]¶

Projects points onto a basis vector.

Parameters:	pts (array_like) – 1D array for one vector; 2D N-by-M array for N M-D points. basis (array_like) – 1D M-array specifying which basis vector to project to.
Returns:	Projected point(s) of the same shape.
Return type:	numpy.ndarray

xiuminglib.linalg.solve_quadratic_eqn(a, b, c)[source]¶: Solves $ax^2+bx+c=0$.

xiuminglib.log module¶

xiuminglib.log.get_logger(level=None)[source]¶

Creates a logger for functions in the library.

Parameters:	level (str, optional) – Logging level. Defaults to `logging.INFO`.
Returns:	Logger created.
Return type:	logging.Logger

xiuminglib.metric module¶

class xiuminglib.metric.Base(dtype)[source]¶

Bases: object

The base metric.

dtype¶

Data type, with which data dynamic range is derived.

Type:	numpy.dtype

drange¶

Dynamic range, i.e., difference between the maximum and minimum allowed.

Type:	float

__call__(im1, im2, **kwargs)[source]¶

Parameters:

im1 (numpy.ndarray) – An image of shape H-by-W, H-by-W-by-1, or H-by-W-by-3.
im2 (numpy.ndarray) – The other image, to be compared against `im1`.

Returns:	The metric computed.
Return type:	float

__init__(dtype)[source]¶

Parameters:	dtype (str or numpy.dtype) – Data type, from which dynamic range will be derived.

class xiuminglib.metric.LPIPS(dtype, weight_pb=None)[source]¶

Bases: xiuminglib.metric.Base

The Learned Perceptual Image Patch Similarity (LPIPS) metric (lower is better).

Project page: https://richzhang.github.io/PerceptualSimilarity/

Note

This implementation assumes the minimum value allowed is $0$, so data dynamic range becomes the maximum value allowed.

dtype¶

Data type, with which data dynamic range is derived.

Type:	numpy.dtype

drange¶

Dynamic range, i.e., difference between the maximum and minimum allowed.

Type:	float

lpips_func¶

The LPIPS network packed into a function.

Type:	tf.function

__call__(im1, im2)[source]¶

Parameters:	im1 – im2 –
Returns:	LPIPS computed (lower is better).
Return type:	float

__init__(dtype, weight_pb=None)[source]¶

Parameters:	dtype (str or numpy.dtype) – Data type, from which maximum allowed will be derived. weight_pb (str, optional) – Path to the network weight protobuf. Defaults to the bundled `net-lin_alex_v0.1.pb`.

class xiuminglib.metric.PSNR(dtype)[source]¶

Bases: xiuminglib.metric.Base

Peak Signal-to-Noise Ratio (PSNR) in dB (higher is better).

If the inputs are RGB, they are first converted to luma (or relative luminance, if the inputs are not gamma-corrected). PSNR is computed on the luma.

__call__(im1, im2, mask=None)[source]¶

Parameters:	im1 – im2 – mask (numpy.ndarray, optional) – An H-by-W logical array indicating pixels that contribute to the computation.
Returns:	PSNR in dB.
Return type:	float
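For intuition, PSNR is $10\log_{10}(\text{drange}^2/\text{MSE})$, with the dynamic range derived from the data type. The sketch below covers single-channel integer images only and skips the mask and the RGB-to-luma conversion; the function name is hypothetical.

```python
import numpy as np


def psnr_sketch(im1, im2, dtype='uint8'):
    """PSNR in dB between two single-channel images of an integer dtype."""
    info = np.iinfo(np.dtype(dtype))
    drange = float(info.max) - float(info.min)
    mse = np.mean((im1.astype(float) - im2.astype(float)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10 * np.log10(drange ** 2 / mse)


im1 = np.zeros((4, 4), dtype=np.uint8)
psnr_small_err = psnr_sketch(im1, np.ones((4, 4), dtype=np.uint8))
psnr_max_err = psnr_sketch(im1, np.full((4, 4), 255, dtype=np.uint8))
psnr_identical = psnr_sketch(im1, im1)
```

An error of one gray level everywhere gives about 48.13 dB for `uint8`, while the largest possible error gives 0 dB.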

class xiuminglib.metric.SSIM(dtype)[source]¶

Bases: xiuminglib.metric.Base

The (multi-scale) Structural Similarity Index (SSIM) $\in [0,1]$ (higher is better).

If the inputs are RGB, they are first converted to luma (or relative luminance, if the inputs are not gamma-corrected). SSIM is computed on the luma.

__call__(im1, im2, multiscale=False)[source]¶

Parameters:	im1 – im2 – multiscale (bool, optional) – Whether to compute MS-SSIM.
Returns:	SSIM computed (higher is better).
Return type:	float

xiuminglib.metric.compute_ci(data, level=0.95)[source]¶

Computes confidence interval.

Parameters:	data (list(float)) – Samples. level (float, optional) – Confidence level. Defaults to $0.95$.
Returns:	One-sided interval (i.e., mean $\pm$ this number).
Return type:	float

xiuminglib.os module¶

xiuminglib.os.call(cmd, cwd=None, wait=True, quiet=False)[source]¶

Executes a command in shell.

Parameters:

cmd (str) – Command to be executed.
cwd (str, optional) – Directory to execute the command in. None means current directory.
wait (bool, optional) – Whether to block until the call finishes.
quiet (bool, optional) – Whether to print out the output stream (if any) and error stream (if error occurred).

Returns:

retcode (int) – Command exit code. 0 means a successful call. Always None if not waiting for the command to finish.
stdout (str) – Standard output stream. Always None if not waiting.
stderr (str) – Standard error stream. Always None if not waiting.

Return type:

tuple

xiuminglib.os.cp(src, dst, cns_parallel_copy=10)[source]¶

Copies files, possibly from/to the Google Colossus Filesystem.

Parameters:	src (str) – Source file or directory. dst (str) – Destination file or directory. cns_parallel_copy (int) – The number of files to be copied in parallel. Only effective when copying a directory from/to Colossus.

xiuminglib.os.exists_isdir(path)[source]¶

Determines whether a path exists, and if so, whether it is a file or directory.

Supports Google Colossus (CNS) paths by using gfile (preferred for speed) or the fileutil CLI.

Parameters:	path (str) – A path.
Returns:

exists (bool) – Whether the path exists.
isdir (bool) – `True` if the path is a directory and `False` if it is a file. `None` if the path doesn’t exist.

Return type:	tuple

xiuminglib.os.fix_terminal()[source]¶: Fixes messed up terminal.

xiuminglib.os.make_exp_dir(directory, param_dict, rm_if_exists=False)[source]¶

Makes an experiment output folder by hashing the experiment parameters.

Parameters:	directory (str) – The made folder will be under this. param_dict (dict) – Dictionary of the parameters identifying the experiment. It is sorted by its keys, so different orders lead to the same hash. rm_if_exists (bool, optional) – Whether to remove the experiment folder if it already exists.

Writes

The experiment parameters in <directory>/<hash>/param.json.

Returns:	The experiment output folder just made.
Return type:	str
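The order-independence of the hash can be illustrated with a small sketch that serializes the parameters with sorted keys before hashing. The choice of JSON serialization and SHA-256, and the name `hash_params`, are assumptions for the example; the library may hash differently.

```python
import hashlib
import json


def hash_params(param_dict):
    """Hashes a parameter dictionary so that key order does not matter."""
    serialized = json.dumps(param_dict, sort_keys=True)
    return hashlib.sha256(serialized.encode('utf-8')).hexdigest()


hash_a = hash_params({'lr': 0.01, 'batch_size': 32})
hash_b = hash_params({'batch_size': 32, 'lr': 0.01})  # same content, other order
hash_c = hash_params({'batch_size': 64, 'lr': 0.01})  # different content
```

Two runs with the same parameters then map to the same folder name, while any changed value produces a different one.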

xiuminglib.os.makedirs(directory, rm_if_exists=False)[source]¶

Wraps os.makedirs() to support removing the directory if it already exists.

Google Colossus-compatible: it tries to use gfile first for speed. This will fail if Blaze is not used, in which case it then falls back to using fileutil CLI as external process calls.

Parameters:

directory (str) – Directory to make.
rm_if_exists (bool, optional) – Whether to remove the directory (and its contents) if it already exists.

xiuminglib.os.open_file(path, mode)[source]¶

Opens a file.

Supports Google Colossus if gfile can be imported.

Parameters:	path (str) – Path to open. mode (str) – `'r'`, `'rb'`, `'w'`, or `'wb'`.
Returns:	File handle that can be used as a context.

xiuminglib.os.rm(path)[source]¶

Removes a file or recursively a directory, with Google Colossus compatibility.

Parameters:	path (str) –

xiuminglib.os.sortglob(directory, filename='*', ext=None, ext_ignore_case=False)[source]¶

Globs and then sorts filenames, possibly ending with multiple extensions, in a directory.

Supports Google Colossus, by using gfile (preferred for speed) or the fileutil CLI when Blaze is not used (hence, gfile unavailable).

Parameters:

directory (str) – Directory to glob, e.g., `'/path/to/'`.
filename (str or tuple(str), optional) – Filename pattern excluding extensions, e.g., `'img'`.
ext (str or tuple(str), optional) – Extensions of interest, e.g., `('png', 'jpg')`. `None` means no extension, useful for folders or files with no extension.
ext_ignore_case (bool, optional) – Whether to ignore case for extensions.

Returns:	Sorted list of files globbed.
Return type:	list(str)

xiuminglib.sig module¶

xiuminglib.sig.dct_1d_bases(n)[source]¶

Generates 1D discrete cosine transform (DCT) bases.

Bases are rows of $Y$, which is orthogonal: $Y^TY=YY^T=I$. The forward process (analysis) is $X=Yx$, and the inverse (synthesis) is $x=Y^{-1}X=Y^TX$. See main() for example usages and how this produces the same results as scipy.fftpack.dct() (with type=2 and norm='ortho').

Parameters:	n (int) – Signal length.
Returns:	Matrix whose $i$-th row, when dotted with signal (column) vector, gives the coefficient for the $i$-th DCT component. Of shape `(n, n)`.
Return type:	numpy.ndarray
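
As a rough illustration of the properties above, the following NumPy sketch builds an orthonormal DCT-II matrix (the convention of `scipy.fftpack.dct()` with `type=2` and `norm='ortho'`) and checks orthogonality and perfect reconstruction. `dct_1d_bases_sketch` is a hypothetical stand-in, not the library function, and the explicit cosine formula is an assumption about how such a matrix can be built.

```python
import numpy as np


def dct_1d_bases_sketch(n):
    """Orthonormal DCT-II matrix of shape (n, n); hypothetical stand-in."""
    k = np.arange(n)[:, None]  # frequency index, one per row
    t = np.arange(n)[None, :]  # sample index, one per column
    Y = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    Y[0, :] /= np.sqrt(2.0)  # rescale the DC row so that all rows have unit norm
    return Y


n = 8
Y = dct_1d_bases_sketch(n)
x = np.random.RandomState(0).rand(n)  # arbitrary test signal

X = Y @ x        # analysis: X = Y x
x_rec = Y.T @ X  # synthesis: x = Y^T X
```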

xiuminglib.sig.dct_2d_bases(h, w)[source]¶

Generates bases for 2D discrete cosine transform (DCT).

Bases are given in two matrices $Y_h$ and $Y_w$. See dct_1d_bases() for their properties. Note that $Y_w$ has already been transposed (hence, $Y_hxY_w$ instead of $Y_hxY_w^T$ below).

Input image $x$ should be transformed by both matrices (i.e., along both dimensions). Specifically, the analysis process is $X=Y_hxY_w$, and the synthesis process is $x=Y_h^TXY_w^T$. See main() for example usages.

Parameters:

h (int) – Image height.
w (int) – Image width.

Returns:

dct_mat_h (numpy.ndarray) – DCT matrix $Y_h$ transforming rows of the 2D signal. Of shape `(h, h)`.
dct_mat_w (numpy.ndarray) – DCT matrix $Y_w$ transforming columns. Of shape `(w, w)`.

Return type:

tuple
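
The 2D analysis and synthesis formulas can be checked numerically with a small NumPy sketch. The helper `dct_1d` below is a hypothetical stand-in that builds an orthonormal DCT-II matrix. Storing $Y_w$ already transposed follows the description above, but the rest is an assumption rather than the library's code.

```python
import numpy as np


def dct_1d(n):
    """Orthonormal DCT-II matrix of shape (n, n); hypothetical helper."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    Y = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    Y[0, :] /= np.sqrt(2.0)
    return Y


h, w = 4, 6
Y_h = dct_1d(h)    # acts from the left, along the height
Y_w = dct_1d(w).T  # already transposed, so it can be applied from the right as is
x = np.random.RandomState(0).rand(h, w)  # arbitrary test image

X = Y_h @ x @ Y_w          # analysis: X = Y_h x Y_w
x_rec = Y_h.T @ X @ Y_w.T  # synthesis: x = Y_h^T X Y_w^T
```

Because both matrices are orthogonal, the transform also preserves the signal energy, i.e., the sum of squared entries of `X` equals that of `x`.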

xiuminglib.sig.dct_2d_bases_vec(h, w)[source]¶

Generates bases stored in a single matrix, along whose height 2D frequencies get raveled.

This function uses the “vectorization + Kronecker product” trick: $\operatorname{vec}(Y_hxY_w)=\left(Y_w^T\otimes Y_h\right) \operatorname{vec}(x)$. So unlike dct_2d_bases(), this function generates a single matrix $Y=Y_w^T\otimes Y_h$, whose $k$-th row is the flattened $(i, j)$-th basis, where $k=wi+j$.

Input image $x$ can be transformed with a single matrix multiplication. Specifically, the analysis process is $X=Y \operatorname{vec}(x)$, and the synthesis process is $x= \operatorname{unvec}(Y^TX)$. See main() for examples.

Warning

If you want to reconstruct the signal with only some (i.e., not all) bases, do not slice those rows out from $Y$ and use only their coefficients. Instead, you should use the full $Y$ matrix and set to zero the coefficients for the unused frequency components. See main() for examples.

Parameters:	h (int) – Image height. w (int) – Image width.
Returns:	Matrix with flattened bases as rows. The $k$-th row, when `numpy.reshape()`’ed into `(h, w)`, is the $(i, j)$-th frequency component, where $k=wi+j$. Of shape `(h * w, h * w)`.
Return type:	numpy.ndarray
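
The Kronecker identity and the "zero out unused coefficients" approach to partial reconstruction can be sketched in NumPy as follows. Everything here is illustrative: `dct_1d` is a hypothetical helper, and the sketch assumes column-major vectorization (`order='F'`), which is the convention under which $\operatorname{vec}(AXB)=(B^T\otimes A)\operatorname{vec}(X)$ holds. The library's own flattening order may differ.

```python
import numpy as np


def dct_1d(n):
    """Orthonormal DCT-II matrix of shape (n, n); hypothetical helper."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    Y = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    Y[0, :] /= np.sqrt(2.0)
    return Y


h, w = 3, 5
Y_h = dct_1d(h)
Y_w = dct_1d(w).T        # already transposed, as in the two-matrix form
Y = np.kron(Y_w.T, Y_h)  # single matrix of shape (h * w, h * w)
x = np.random.RandomState(0).rand(h, w)

# Analysis with one matrix product, using column-major vec (an assumption)
X_vec = Y @ x.ravel(order='F')
X_2d = Y_h @ x @ Y_w  # two-matrix form, for comparison
identity_holds = np.allclose(X_vec, X_2d.ravel(order='F'))

# Full synthesis: x = unvec(Y^T X)
x_rec = (Y.T @ X_vec).reshape((h, w), order='F')

# Partial synthesis: keep the full Y, zero every coefficient except the DC term
X_dc = np.zeros_like(X_vec)
X_dc[0] = X_vec[0]
x_dc = (Y.T @ X_dc).reshape((h, w), order='F')
```

With only the DC coefficient kept, the reconstruction is a constant image whose value is the mean of `x`.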

xiuminglib.sig.dft_1d_bases(n)[source]¶

Generates 1D discrete Fourier transform (DFT) bases.

Bases are rows of $Y$, which is unitary ($Y^HY=YY^H=I$, where $Y^H$ is the conjugate transpose) and symmetric. The forward process (analysis) is $X=Yx$, and the inverse (synthesis) is $x=Y^{-1}X=Y^HX$. See main() for example usages.

Parameters:	n (int) – Signal length.
Returns:	Matrix whose $i$-th row, when dotted with signal (column) vector, gives the coefficient for the $i$-th Fourier component. Of shape `(n, n)`.
Return type:	numpy.ndarray
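
A unitary DFT matrix with these properties can be sketched in NumPy and compared against `numpy.fft.fft()` with `norm='ortho'`. `dft_1d_bases_sketch` is a hypothetical stand-in, and the negative-exponent sign convention is an assumption chosen to match NumPy's forward transform.

```python
import numpy as np


def dft_1d_bases_sketch(n):
    """Unitary DFT matrix of shape (n, n); hypothetical stand-in."""
    k = np.arange(n)[:, None]  # frequency index, one per row
    t = np.arange(n)[None, :]  # sample index, one per column
    return np.exp(-2j * np.pi * k * t / n) / np.sqrt(n)


n = 8
Y = dft_1d_bases_sketch(n)
x = np.random.RandomState(0).rand(n)  # arbitrary test signal

X = Y @ x               # analysis: X = Y x
x_rec = Y.conj().T @ X  # synthesis: x = Y^H X
X_ref = np.fft.fft(x, norm='ortho')  # reference result from NumPy
```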

xiuminglib.sig.dft_2d_bases(h, w)[source]¶

Generates bases for 2D discrete Fourier transform (DFT).

Bases are given in two matrices $Y_h$ and $Y_w$. See dft_1d_bases() for their properties. Note that $Y_w$ has already been transposed.

Input image $x$ should be transformed by both matrices (i.e., along both dimensions). Specifically, the analysis process is $X=Y_hxY_w$, and the synthesis process is $x=Y_h^HXY_w^H$. See main() for example usages and how this produces the same results as numpy.fft.fft2() (with norm='ortho').
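
The equivalence with `numpy.fft.fft2()` can be checked with a short NumPy sketch. The helper `dft_1d` is hypothetical, and the sign and normalization conventions are assumptions chosen to match NumPy's `norm='ortho'`.

```python
import numpy as np


def dft_1d(n):
    """Unitary DFT matrix of shape (n, n); hypothetical helper."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    return np.exp(-2j * np.pi * k * t / n) / np.sqrt(n)


h, w = 4, 6
Y_h = dft_1d(h)
Y_w = dft_1d(w).T  # already transposed (equal to dft_1d(w), since it is symmetric)
x = np.random.RandomState(0).rand(h, w)  # arbitrary test image

X = Y_h @ x @ Y_w                        # analysis: X = Y_h x Y_w
x_rec = Y_h.conj().T @ X @ Y_w.conj().T  # synthesis: x = Y_h^H X Y_w^H
X_ref = np.fft.fft2(x, norm='ortho')     # reference result from NumPy
```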

xiuminglib.tracker module¶

class xiuminglib.tracker.LucasKanadeTracker(frames, pts, backtrack_thres=1, lk_params=None)[source]¶

Bases: object

Lucas-Kanade tracker.

frames¶

Grayscale.

Type:	list(numpy.array)

pts¶

Type:	numpy.array

lk_params¶

Type:	dict

backtrack_thres¶

Type:	float

tracks¶

Positions of tracks from the $i$-th to $(i+1)$-th frame. Arrays are of shape N-by-2.

+------------>
|       tracks[:, 1]
|
|
v tracks[:, 0]

Type:	list(numpy.array)

can_backtrack¶

Whether each track can be back-tracked to the previous frame. Arrays should be Boolean.

Type:	list(numpy.array)

is_lost¶

Whether each track is lost in this frame. Arrays should be Boolean.

Type:	list(numpy.array)

__init__(frames, pts, backtrack_thres=1, lk_params=None)[source]¶

Parameters:

frames (list(numpy.array)) – Frame images in order. Arrays are either H-by-W or H-by-W-by-3, and will be converted to grayscale.
pts (array_like) –
Points to track in the first frame. Of shape N-by-2.
```
+------------>
|       pts[:, 1]
|
|
v pts[:, 0]
```
backtrack_thres (float, optional) – Largest pixel deviation in the $x$ or $y$ direction of a successful backtrack.
lk_params (dict, optional) – Keyword parameters for cv2.calcOpticalFlowPyrLK().
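
To make the role of `backtrack_thres` concrete, here is a minimal NumPy sketch of a per-point backtrack test on made-up coordinates. It assumes that a point passes when its largest deviation along either axis is strictly below the threshold. How the library obtains the back-tracked points, and whether it uses `<` or `<=`, are not stated in this documentation.

```python
import numpy as np

backtrack_thres = 1.0

# Made-up data: points in the previous frame, and where the same points land
# after being tracked forward one frame and then backward again
pts_prev = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
pts_back = np.array([[10.2, 19.9], [30.0, 42.5], [50.9, 60.0]])

# Largest deviation along either axis, per point
deviation = np.abs(pts_back - pts_prev).max(axis=1)
can_backtrack = deviation < backtrack_thres  # strict inequality is an assumption
```

Here the second point drifts by 2.5 pixels along one axis, so it is the only one that fails the test.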

run(constrain=None)[source]¶

Runs tracking.

Parameters:	constrain (function, optional) – Function applied to tracks before they are fed to the next round. It should take in an N-by-2 array as well as the current workspace (as a dictionary) and return another array.
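
A constraint function of this shape might look like the sketch below, which clips tracks to the frame bounds. The frame size is hypothetical, and the contents of the workspace dictionary are not documented here, so the function ignores it. Column 0 is treated as the vertical coordinate, following the diagram for `pts`.

```python
import numpy as np

FRAME_H, FRAME_W = 240, 320  # hypothetical frame size, for illustration only


def clip_to_frame(pts, workspace):
    """Keep tracks inside the frame; `workspace` is accepted but unused."""
    pts = np.asarray(pts, dtype=float).copy()
    pts[:, 0] = np.clip(pts[:, 0], 0, FRAME_H - 1)  # vertical coordinate
    pts[:, 1] = np.clip(pts[:, 1], 0, FRAME_W - 1)  # horizontal coordinate
    return pts


# Two out-of-bounds points, with an empty dictionary standing in for the workspace
clipped = clip_to_frame(np.array([[-5.0, 10.0], [250.0, 400.0]]), {})
```

Such a function would then be passed as `run(constrain=clip_to_frame)`.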

vis(out_dir, marker_bgr=(0, 0, 255))[source]¶

Visualizes results.

Parameters:	out_dir (str) – Output directory. marker_bgr (tuple, optional) – Marker BGR color.

Writes

Each frame with tracked points marked out.