xiuminglib package

Submodules

xiuminglib.camera module

class xiuminglib.camera.PerspCam(name='cam', f_pix=533.33, im_res=(256, 256), loc=(1, 1, 1), lookat=(0, 0, 0), up=(0, 1, 0))[source]

Bases: object

Perspective camera in 35mm format.

This is not an OpenGL/Blender camera (where \(+x\) points right, \(+y\) up, and \(-z\) into the viewing direction), but rather a “CV camera” (where \(+x\) points right, \(+y\) down, and \(+z\) into the viewing direction). See more in ext_mat.

Because we mostly consider just the camera and the object, we assume the object coordinate system (the “local system” in Blender) aligns with (and hence, is the same as) the world coordinate system (the “global system” in Blender).

Note

  • Sensor width of the 35mm format is actually 36mm.
  • This class assumes unit pixel aspect ratio (i.e., \(f_x = f_y\)) and no skewing between the sensor plane and optical axis.
  • The active sensor size may be smaller than sensor_w and sensor_h, depending on im_res. See sensor_w_active and sensor_h_active.
  • aov, sensor_h, and sensor_w are hardware properties, having nothing to do with im_res.
__init__(name='cam', f_pix=533.33, im_res=(256, 256), loc=(1, 1, 1), lookat=(0, 0, 0), up=(0, 1, 0))[source]
Parameters:
  • name (str, optional) – Camera name.
  • f_pix (float, optional) – Focal length in pixels.
  • im_res (array_like, optional) – Image height and width in pixels.
  • loc (array_like, optional) – Camera location in object space.
  • lookat (array_like, optional) – Where the camera points to in object space, so default \((0, 0, 0)\) is the object center.
  • up (array_like, optional) – Vector in object space that, when projected, points upward in image.
aov

Vertical and horizontal angles of view in degrees.

Type:numpy.ndarray
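
The relation between focal length, sensor size, and angle of view can be sketched with the standard pinhole formula. This is a minimal illustration, not the library's implementation: angle_of_view_deg is a hypothetical helper, and the 50mm focal length is an arbitrary example value. The sensor dimensions are the 36mm-by-24mm values stated for this class.

```python
import math

def angle_of_view_deg(sensor_mm, f_mm):
    """Angle of view (degrees) subtended by a sensor side at a focal length."""
    return math.degrees(2 * math.atan(sensor_mm / (2 * f_mm)))

f_mm = 50.0  # illustrative focal length, not a library default
aov_h = angle_of_view_deg(36.0, f_mm)  # horizontal, over the 36mm width
aov_v = angle_of_view_deg(24.0, f_mm)  # vertical, over the 24mm height
print(aov_v, aov_h)
```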
backproj(depth, fg_mask=None, bg_fill=0.0, depth_type='plane', space='object')[source]

Backprojects a depth map to 3D points.

Resolution of the depth map may be different from im_h and im_w: im_h and im_w decide the image coordinate bounds, and the depth resolution decides the number of steps.

Parameters:
  • depth (numpy.ndarray) – Depth map.
  • fg_mask (numpy.ndarray, optional) – Backproject only pixels falling inside this foreground mask. Its values should be logical.
  • bg_fill (float, optional) – Filler value for background region.
  • depth_type (str, optional) – Plane or ray depth.
  • space (str, optional) – In which space the backprojected points are specified: 'object' or 'camera'.
Returns:

\(xyz\) map.

Return type:

numpy.ndarray

blender_rot_euler

Euler rotations in degrees.

Type:numpy.ndarray
ext_mat

\(3\times 4\) object-to-camera extrinsics matrix, i.e., rotation and translation that transform a point from object space to camera space.

Two coordinate systems are involved: object space “obj” and camera space following the computer vision convention “cv”, where \(+x\) points horizontally right (to align with pixel coordinates), \(+y\) points vertically down, and \(+z\) is the look-at direction (because the system is right-handed).

Type:numpy.ndarray
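
One common way to build such an object-to-camera matrix from a location, a look-at point, and an up vector is sketched below. It is an assumption-laden illustration rather than the library's code: lookat_ext_mat is a hypothetical helper, and it assumes the up vector is not parallel to the viewing direction.

```python
import numpy as np

def lookat_ext_mat(loc, lookat, up):
    """Builds a 3x4 object-to-camera matrix in the CV convention
    (+x right, +y down, +z along the viewing direction)."""
    loc, lookat, up = (np.asarray(v, dtype=float) for v in (loc, lookat, up))
    z = lookat - loc
    z /= np.linalg.norm(z)        # viewing direction
    x = np.cross(z, up)
    x /= np.linalg.norm(x)        # image right
    y = np.cross(z, x)            # image down (unit length, since z and x are orthonormal)
    rot = np.vstack((x, y, z))    # rows are the camera axes expressed in object space
    trans = -rot @ loc            # moves the camera center to the origin
    return np.hstack((rot, trans[:, None]))

ext = lookat_ext_mat(loc=(0, 0, -5), lookat=(0, 0, 0), up=(0, 1, 0))
origin_cam = ext @ np.array([0.0, 0.0, 0.0, 1.0])  # object origin in camera space
above_cam = ext @ np.array([0.0, 1.0, 0.0, 1.0])   # a point above the origin
```

In this example the object origin lands 5 units in front of the camera along \(+z\), and the point above the origin gets a negative \(y\), consistent with \(+y\) pointing down.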
ext_mat_4x4

Padding \([0, 0, 0, 1]\) to bottom of the \(3\times 4\) extrinsics matrix to make it invertible.

Type:numpy.ndarray
f_mm

35mm format-equivalent focal length in mm.

Type:float
f_pix

Focal length in pixels.

Type:float
gen_rays(spp=1)[source]

Generates ray directions in object space, with the ray origin being the camera location.

Parameters:spp (int, optional) – Samples (or number of rays) per pixel. Must be a perfect square \(S^2\) due to uniform, deterministic supersampling.
Returns:An \(H\times W\times S^2\times 3\) array of ray directions.
Return type:numpy.ndarray
get_cam2obj(cam_type='cv', square=False)[source]

Inverse of get_obj2cam().

One example use: calling this with cam_type='blender' gives Blender’s cam.matrix_world.

get_obj2cam(cam_type='cv', square=False)[source]

Gets the object-to-camera transformation matrix.

Parameters:
  • cam_type (str, optional) – Accepted are 'cv'/'opencv' and 'opengl'/'blender'.
  • square (bool, optional) – If true, the last row of \([0, 0, 0, 1]\) is kept, which makes the matrix invertible.
Returns:

\(3\times 4\) or \(4\times 4\) object-to-camera transformation matrix.

Return type:

numpy.ndarray
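
The difference between the two accepted camera types can be sketched as a sign flip of the \(y\) and \(z\) axes, following the conventions described for this class. The matrix ext_cv below is a made-up example (identity rotation, translation of 5 along \(z\)), not output from the library.

```python
import numpy as np

# Hypothetical 3x4 CV-convention extrinsics: identity rotation, z-translation of 5
ext_cv = np.hstack((np.eye(3), np.array([[0.0], [0.0], [5.0]])))

# Negating the y and z rows turns "+y down, +z forward" into "+y up, -z forward"
cv_to_gl = np.diag([1.0, -1.0, -1.0])
ext_gl = cv_to_gl @ ext_cv

pt = np.array([1.0, 2.0, 3.0, 1.0])  # homogeneous object-space point
pt_cv = ext_cv @ pt
pt_gl = ext_gl @ pt
```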

im_h

Image height.

Type:int
im_w

Image width.

Type:int
int_mat

\(3\times 3\) intrinsics matrix.

Type:numpy.ndarray
loc

Camera location in object space.

Type:numpy.ndarray
lookat

Where in object space the camera points to.

Type:numpy.ndarray
mm_per_pix

Millimeters per pixel.

Type:float
name

Camera name.

Type:str
proj(pts, space='object')[source]

Projects 3D points to 2D.

Parameters:
  • pts (array_like) – 3D point(s) of shape \(N\times 3\) or \(3\times N\), or of length 3.
  • space (str, optional) – In which space these points are specified: 'object' or 'camera'.
Returns:

Vertical and horizontal coordinates of the projections, following:

+-----------> dim1
|
|
|
v dim0

Return type:

array_like
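
A pinhole projection of a camera-space point can be sketched as follows. This is only an illustration under stated assumptions: the principal point is assumed to sit at the image center, and the variable names are hypothetical rather than taken from the library.

```python
import numpy as np

f_pix, im_h, im_w = 533.33, 256, 256
# Intrinsics with the principal point assumed at the image center
int_mat = np.array([
    [f_pix, 0.0, im_w / 2],
    [0.0, f_pix, im_h / 2],
    [0.0, 0.0, 1.0]])

pt_cam = np.array([0.5, -0.25, 4.0])  # camera-space point (CV convention)
uvw = int_mat @ pt_cam
u, v = uvw[:2] / uvw[2]   # horizontal, then vertical pixel coordinates
vh = np.array([v, u])     # vertical first (dim0), then horizontal (dim1)
```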

proj_mat

\(3\times 4\) projection matrix, derived from intrinsics and extrinsics.

Type:numpy.ndarray
resize(new_h=None, new_w=None)[source]

Updates the camera intrinsics according to the new size.

Parameters:
  • new_h (int, optional) – Target height. If None, will be calculated according to the target width, assuming the same aspect ratio.
  • new_w (int, optional) – Target width. If None, will be calculated according to the target height, assuming the same aspect ratio.
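
How intrinsics change with image size can be sketched as below. This is a hypothetical helper, not the method itself: it assumes a unit pixel aspect ratio (as this class does), so the focal length in pixels scales with the image width, and it assumes the aspect ratio is preserved.

```python
def resize_intrinsics(f_pix, im_h, im_w, new_h=None, new_w=None):
    """Returns (f_pix, im_h, im_w) after resizing, keeping the aspect ratio
    when only one target dimension is given."""
    if new_h is None and new_w is None:
        raise ValueError("Provide at least one of new_h and new_w")
    if new_h is None:
        new_h = int(round(im_h * new_w / im_w))
    if new_w is None:
        new_w = int(round(im_w * new_h / im_h))
    # With unit pixel aspect ratio, the focal length scales with the image
    scale = new_w / im_w
    return f_pix * scale, new_h, new_w

new_f_pix, new_h, new_w = resize_intrinsics(533.33, 256, 256, new_w=512)
```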
sensor_fit_horizontal

Whether the field-of-view angle fits along the horizontal (True) or vertical (False) direction.

Type:bool
sensor_h

Sensor’s physical height (fixed at 24mm).

Type:float
sensor_h_active

Actual sensor height (mm) in use (resolution-dependent).

Type:float
sensor_w

Sensor’s physical width (fixed at 36mm).

Type:float
sensor_w_active

Actual sensor width (mm) in use (resolution-dependent).

Type:float
set_from_mitsuba(xml_path)[source]

Sets camera according to a Mitsuba XML file.

Parameters:xml_path (str) – Path to the XML file.
to_dict(app=None)[source]

Converts this camera to a dictionary of its properties.

Parameters:app (str, optional) – For what application are we converting? Accepted are None and 'blender'.
Returns:This camera as a dictionary.
Return type:dict
up

Up vector, the vector in object space that, when projected, points upward on image plane.

Type:numpy.ndarray
xiuminglib.camera.safe_cast_to_int(x)[source]

Casts a string or float to integer only when safe.

Parameters:x (str or float) – Input to be cast to integer.
Returns:Integer version of the input.
Return type:int

xiuminglib.const module

class xiuminglib.const.Dir[source]

Bases: object

data = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data'
mstatus = '/tmp/machine-status/runtime'
tmp = '/tmp/'
class xiuminglib.const.Path[source]

Bases: object

armadillo = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/armadillo.ply'
buddha = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip.ply'
buddha_prefix = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip'
buddha_res2 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res2.ply'
buddha_res3 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res3.ply'
buddha_res4 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res4.ply'
bunny = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper.ply'
bunny_prefix = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper'
bunny_res2 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res2.ply'
bunny_res3 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res3.ply'
bunny_res4 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res4.ply'
cameraman = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/images/cameraman.png'
checker = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/textures/checker.png'
cpustatus = '/tmp/cpu/machine_status.txt'
dragon = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip.ply'
dragon_prefix = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip'
dragon_res2 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res2.ply'
dragon_res3 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res3.ply'
dragon_res4 = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res4.ply'
gpustatus = '/tmp/gpu/{machine_name}'
lenna = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/images/lenna.png'
lpips_weights = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/lpips/net-lin_alex_v0.1.pb'
open_sans_regular = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/fonts/open-sans/OpenSans-Regular.ttf'
teapot = '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/teapot.obj'

xiuminglib.decor module

Decorators that wrap a function.

If the function is defined in the file where you want to use the decorator, you can decorate the function at define time:

@decorator
def somefunc():
    return

If the function is defined somewhere else, do:

from numpy import mean

mean = decorator(mean)
xiuminglib.decor.colossus_interface(somefunc)[source]

Wraps black-box functions to read from and write to Google Colossus.

Because it’s hard (if possible at all) to figure out which path is input, and which is output, when the input function is black-box, this is a “best-effort” decorator (see below for warnings).

This decorator works by looping through all the positional and keyword parameters, copying CNS paths that exist prior to somefunc execution to temporary local locations, running somefunc and writing its output to local locations, and finally copying local paths that get modified by somefunc to their corresponding CNS locations.

Warning

Therefore, if somefunc’s output already exists (e.g., you are re-running the function to overwrite the old result), it will be copied to local, overwritten by somefunc locally, and finally copied back to CNS. This doesn’t lead to wrong behaviors, but is inefficient.

This decorator doesn’t depend on Blaze, as it’s using the fileutil CLI, rather than google3.pyglib.gfile. This is convenient in at least two cases:

  • You are too lazy to use Blaze, want to run tests quickly on your local machine, but need access to CNS files.
  • Your IO is more complex than what with gfile.Open(...) as h: can do (e.g., a Blender function importing an object from a path), in which case you have to copy the CNS file to local (“local” here could also mean a Borglet’s local).

This interface generally works with resolved paths (e.g., /path/to/file), but not with wildcard paths (e.g., /path/to/???), since it’s hard (if possible at all) to guess what your function tries to do with such wildcard paths.

Writes
  • Input files copied from Colossus to $TMP/.
  • Output files generated to $TMP/, to be copied to Colossus.
xiuminglib.decor.existok(makedirs_func)[source]

Implements the exist_ok flag of os.makedirs() in Python 3.2+, which avoids the race condition where one parallel worker sees that the folder doesn’t exist and tries to create it, but another worker creates it first.
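
The idea can be sketched as a decorator that tolerates an already-existing folder. existok_sketch is a hypothetical re-implementation for illustration; the library's actual decorator may differ in detail.

```python
import os
import tempfile
from functools import wraps

def existok_sketch(makedirs_func):
    """Wraps a makedirs-like function so an existing folder is not an error."""
    @wraps(makedirs_func)
    def wrapper(*args, **kwargs):
        try:
            makedirs_func(*args, **kwargs)
        except FileExistsError:
            pass  # another worker (or an earlier call) created the folder first
    return wrapper

safe_makedirs = existok_sketch(os.makedirs)
target = os.path.join(tempfile.mkdtemp(), "a", "b")
safe_makedirs(target)
safe_makedirs(target)  # second call is a no-op instead of raising
```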

xiuminglib.decor.main()[source]

Unit tests that can also serve as example usage.

xiuminglib.decor.timeit(somefunc)[source]

Outputs the time a function takes to execute.
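
A timing decorator of this kind can be sketched as follows. timeit_sketch and slow_sum are hypothetical names, and the output format is an assumption rather than what the library prints.

```python
import time
from functools import wraps

def timeit_sketch(somefunc):
    """Prints how long each call to the wrapped function takes."""
    @wraps(somefunc)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        result = somefunc(*args, **kwargs)
        elapsed = time.perf_counter() - t0
        print(f"{somefunc.__name__}() took {elapsed:.6f} s")
        return result  # the wrapped function's result is passed through
    return wrapper

@timeit_sketch
def slow_sum(n):
    return sum(range(n))

total = slow_sum(100_000)
```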

xiuminglib.img module

xiuminglib.img.alpha_blend(arr1, alpha, arr2=None)[source]

Alpha-blends two arrays, or masks one array.

Parameters:
  • arr1 (numpy.ndarray) – Input array.
  • alpha (numpy.ndarray) – Alpha map whose values are \(\in [0,1]\).
  • arr2 (numpy.ndarray) – Input array. If None, arr1 will be blended with an all-zero array, equivalent to masking arr1.
Returns:

Blended array of type float.

Return type:

numpy.ndarray
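
The blending described above can be sketched with the usual convex combination. alpha_blend_sketch is a hypothetical stand-in; the formula alpha * arr1 + (1 - alpha) * arr2 is the standard definition and is assumed, not confirmed, to match the library.

```python
import numpy as np

def alpha_blend_sketch(arr1, alpha, arr2=None):
    """Blends arr1 over arr2 with per-element weights alpha in [0, 1]."""
    arr1 = np.asarray(arr1, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if arr2 is None:
        arr2 = np.zeros_like(arr1)  # blending with zeros masks arr1
    return alpha * arr1 + (1 - alpha) * np.asarray(arr2, dtype=float)

fg = np.full((2, 2), 200.0)
bg = np.full((2, 2), 100.0)
alpha = np.array([[0.0, 0.5], [1.0, 0.25]])
blended = alpha_blend_sketch(fg, alpha, bg)
masked = alpha_blend_sketch(fg, alpha)
```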

xiuminglib.img.binarize(im, threshold=None)[source]

Binarizes images.

Parameters:
  • im (numpy.ndarray) – Image to binarize. Of any integer type (uint8, uint16, etc.). If H-by-W-by-3, will be converted to grayscale and treated as H-by-W.
  • threshold (float, optional) – Threshold for binarization. None means midpoint of the dtype.
Returns:

Binarized image. Of only 0’s and 1’s.

Return type:

numpy.ndarray

xiuminglib.img.compute_gradients(im)[source]

Computes magnitudes and orientations of image gradients.

With Scharr operators:

[ 3 0 -3 ]           [ 3  10  3]
[10 0 -10]    and    [ 0   0  0]
[ 3 0 -3 ]           [-3 -10 -3]
Parameters:im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C if multi-channel (e.g., RGB) images. Gradients are computed independently for each of the C channels.
Returns:
  • grad_mag (numpy.ndarray) – Magnitude image of the gradients.
  • grad_orient (numpy.ndarray) – Orientation image of the gradients (in radians).
           y ^ pi/2
             |
    pi       |
     --------+--------> 0
    -pi      |       x
             | -pi/2
    
Return type:tuple
xiuminglib.img.denormalize_float(arr, uint_type='uint8')[source]

De-normalizes the input float array such that \(1\) becomes the target uint maximum.

Parameters:
  • arr (numpy.ndarray) – Input array of type float.
  • uint_type (str, optional) – Target uint type.
Returns:

De-normalized array of the target type.

Return type:

numpy.ndarray

xiuminglib.img.find_local_extrema(im, want_maxima, kernel_size=3)[source]

Finds local maxima or minima in an image.

Parameters:
  • im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C for multi-channel (e.g., RGB) images. Extrema are found independently for each of the C channels.
  • want_maxima (bool) – Whether maxima or minima are wanted.
  • kernel_size (int, optional) – Side length of the square window under consideration. Must be larger than 1.
Returns:

Binary map indicating if each pixel is a local extremum.

Return type:

numpy.ndarray

xiuminglib.img.gamma_correct(im, gamma=2.2)[source]

Applies gamma correction to a uint image.

Parameters:
  • im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C if multi-channel (e.g., RGB) uint images.
  • gamma (float, optional) – Gamma value \(< 1\) shifts image towards the darker end of the spectrum, while value \(> 1\) towards the brighter.
Returns:

Gamma-corrected image.

Return type:

numpy.ndarray

xiuminglib.img.grid_query_img(im, query_x, query_y, method='bilinear')[source]

Grid queries an image via interpolation.

If you want to grid query unstructured data, consider grid_query_unstruct().

This function uses either bilinear interpolation that allows you to break big matrices into patches and work locally, or bivariate spline interpolation that fits a global spline (so memory-intensive) and shows global effects.

Parameters:
  • im (numpy.ndarray) – H-by-W or H-by-W-by-C rectangular grid of data. Each of C channels is interpolated independently.
  • query_x (array_like) – \(x\) coordinates of the queried rectangle, e.g., np.arange(10) for a 10-by-10 grid (hence, this should not be generated by numpy.meshgrid() or similar functions).
  • query_y (array_like) –

    \(y\) coordinates, following this convention:

    +---------> query_x
    |
    |
    |
    v query_y
    
  • method (str, optional) – Interpolation method: 'spline' or 'bilinear'.
Returns:

Interpolated values at query locations, of shape (len(query_y), len(query_x)) for single-channel input or (len(query_y), len(query_x), im.shape[2]) for multi-channel input.

Return type:

numpy.ndarray

xiuminglib.img.grid_query_unstruct(uvs, values, grid_res, method=None)[source]

Grid queries unstructured data given by coordinates and their values.

If you are looking to grid query structured data, such as an image, check out grid_query_img().

This function interpolates values on a rectangular grid given some sparse, unstructured samples. One use case is where you have some UV locations and their associated colors, and you want to “paint the colors” on a UV canvas.

Parameters:
  • uvs (numpy.ndarray) – N-by-2 array of UV coordinates where we have values (e.g., colors). See xiuminglib.blender.object.smart_uv_unwrap() for the UV coordinate convention.
  • values (numpy.ndarray) – N-by-M array of M-D values at the N UV locations, or N-array of scalar values at the N UV locations. Channels are interpolated independently.
  • grid_res (array_like) – Resolution (height first; then width) of the query grid.
  • method (dict, optional) –

    Dictionary of method-specific parameters. Implemented methods and their default parameters:

    # Default
    method = {
        'func': 'griddata',
        # Which SciPy function to call.
    
        'func_underlying': 'linear',
        # Fed to `griddata` as the `method` parameter.
    
        'fill_value': (0,), # black
        # Will be used to fill in pixels outside the convex hulls
        # formed by the UV locations, and if `max_l1_interp` is
        # provided, also the pixels whose interpolation is too much
        # of a stretch to be trusted. In the context of "canvas
        # painting," this will be the canvas' base color.
    
        'max_l1_interp': np.inf, # trust/accept all interpolations
        # Maximum L1 distance, which we can trust in interpolation,
        # to pixels that have values. Interpolation across a longer
        # range will not be trusted, and hence will be filled with
        # `fill_value`.
    }
    
    method = {
        'func': 'rbf',
        # Which SciPy function to call.
    
        'func_underlying': 'linear',
        # Fed to `Rbf` as the `function` parameter.
    
        'smooth': 0, # no smoothing
        # Fed to `Rbf` as the `smooth` parameter.
    }
    
Returns:

Interpolated values at query locations, of shape grid_res for single-channel input or (grid_res[0], grid_res[1], values.shape[1]) for multi-channel input.

Return type:

numpy.ndarray

xiuminglib.img.linear2srgb(im, clip=False)[source]

Converts an image from linear RGB values to sRGB.

Parameters:
  • im (numpy.ndarray) – Of type float, and all pixels must be \(\in [0, 1]\).
  • clip (bool, optional) – Whether to clip values to \([0,1]\). Defaults to False.
Returns:

Converted image in sRGB.

Return type:

numpy.ndarray

xiuminglib.img.normalize_uint(arr)[source]

Normalizes the input uint array such that its dtype maximum becomes \(1\).

Parameters:arr (numpy.ndarray) – Input array of type uint.
Returns:Normalized array of type float.
Return type:numpy.ndarray
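
The normalization and its inverse (see denormalize_float()) can be sketched via the dtype maximum reported by NumPy. Both functions below are hypothetical illustrations; in particular, the rounding in the inverse is an assumption.

```python
import numpy as np

def normalize_uint_sketch(arr):
    """Maps a uint array to float so that the dtype maximum becomes 1."""
    return arr.astype(float) / np.iinfo(arr.dtype).max

def denormalize_float_sketch(arr, uint_type='uint8'):
    """Maps a float array back so that 1 becomes the uint maximum."""
    dtype = np.dtype(uint_type)
    return np.round(arr * np.iinfo(dtype).max).astype(dtype)

im = np.array([0, 128, 255], dtype=np.uint8)
im_float = normalize_uint_sketch(im)
im_back = denormalize_float_sketch(im_float, 'uint8')
```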
xiuminglib.img.remove_islands(im, min_n_pixels, connectivity=4)[source]

Removes small islands of pixels from a binary image.

Parameters:
  • im (numpy.ndarray) – Input binary image. Of only 0’s and 1’s.
  • min_n_pixels (int) – Minimum island size to keep.
  • connectivity (int, optional) – Definition of “connected”: either 4 or 8.
Returns:

Output image with small islands removed.

Return type:

numpy.ndarray

xiuminglib.img.resize(arr, new_h=None, new_w=None, method='cv2')[source]

Resizes an image, with the option of maintaining the aspect ratio.

Parameters:
  • arr (numpy.ndarray) – Image to resize. If multiple-channel, each channel is resized independently.
  • new_h (int, optional) – Target height. If None, will be calculated according to the target width, assuming the same aspect ratio.
  • new_w (int, optional) – Target width. If None, will be calculated according to the target height, assuming the same aspect ratio.
  • method (str, optional) – Accepted values: 'cv2', 'tf', and 'pil'.
Returns:

Resized image.

Return type:

numpy.ndarray

xiuminglib.img.rgb2lum(im)[source]

Converts RGB to relative luminance (if input is linear RGB) or luma (if input is gamma-corrected RGB).

Parameters:im (numpy.ndarray) – RGB array of shape (..., 3).
Returns:Relative luminance or luma array.
Return type:numpy.ndarray
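
A weighted sum over the last axis is the usual way to compute this quantity. The sketch below assumes the Rec. 709 weights; which weights the library actually uses is not stated here, so treat REC709_WEIGHTS and rgb2lum_sketch as hypothetical.

```python
import numpy as np

# Assumed Rec. 709 weights for R, G, and B (they sum to 1)
REC709_WEIGHTS = np.array([0.2126, 0.7152, 0.0722])

def rgb2lum_sketch(im):
    """Weighted sum over the last (RGB) axis of an array of shape (..., 3)."""
    return np.asarray(im, dtype=float) @ REC709_WEIGHTS

white = rgb2lum_sketch(np.ones((2, 2, 3)))
green = rgb2lum_sketch([0.0, 1.0, 0.0])
```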
xiuminglib.img.srgb2linear(im, clip=False)[source]

Converts an image from sRGB values to linear RGB.

Parameters:
  • im (numpy.ndarray) – Of type float, and all pixels must be \(\in [0, 1]\).
  • clip (bool, optional) – Whether to clip values to \([0,1]\). Defaults to False.
Returns:

Converted image in linear RGB.

Return type:

numpy.ndarray
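
The two conversions (this function and linear2srgb()) can be sketched with the standard piecewise sRGB transfer function. The constants are those of the sRGB standard; the function names are hypothetical, and clipping is omitted for brevity.

```python
import numpy as np

def linear2srgb_sketch(im):
    """Linear RGB in [0, 1] to sRGB, using the piecewise sRGB curve."""
    im = np.asarray(im, dtype=float)
    low = 12.92 * im
    high = 1.055 * np.power(im, 1 / 2.4) - 0.055
    return np.where(im <= 0.0031308, low, high)

def srgb2linear_sketch(im):
    """sRGB in [0, 1] back to linear RGB (inverse of the curve above)."""
    im = np.asarray(im, dtype=float)
    low = im / 12.92
    high = np.power((im + 0.055) / 1.055, 2.4)
    return np.where(im <= 0.04045, low, high)

linear = np.array([0.0, 0.001, 0.18, 1.0])
srgb = linear2srgb_sketch(linear)
roundtrip = srgb2linear_sketch(srgb)
```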

xiuminglib.img.tonemap(hdr, method='gamma', gamma=2.2)[source]

Tonemaps an HDR image.

Parameters:
  • hdr (numpy.ndarray) – HDR image.
  • method (str, optional) – Values accepted: 'gamma' and 'reinhard'.
  • gamma (float, optional) – Gamma value used if method is 'gamma'.
Returns:

Tonemapped image \(\in [0, 1]\).

Return type:

numpy.ndarray
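
The two accepted methods can be sketched as below. This is a guess at common definitions, not the library's code: the global Reinhard operator x / (1 + x) and a clip-then-power gamma curve are both assumptions, and tonemap_sketch is a hypothetical name.

```python
import numpy as np

def tonemap_sketch(hdr, method='gamma', gamma=2.2):
    """Maps non-negative HDR values into [0, 1]."""
    hdr = np.asarray(hdr, dtype=float)
    if method == 'reinhard':
        return hdr / (1 + hdr)                    # global Reinhard operator
    if method == 'gamma':
        return np.clip(hdr, 0, 1) ** (1 / gamma)  # clip, then gamma-encode
    raise ValueError(f"Unknown method: {method}")

hdr = np.array([0.0, 1.0, 3.0])
ldr_reinhard = tonemap_sketch(hdr, method='reinhard')
ldr_gamma = tonemap_sketch(np.array([0.25, 4.0]), method='gamma', gamma=2.0)
```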

xiuminglib.imprt module

xiuminglib.imprt.import_module_404ok(*args, **kwargs)[source]

Returns None (instead of failing) in the case of ModuleNotFoundError.
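
This behavior can be sketched with importlib from the standard library. import_module_404ok_sketch is a hypothetical stand-in, and the missing module name is made up for the demonstration.

```python
from importlib import import_module

def import_module_404ok_sketch(*args, **kwargs):
    """Imports a module, returning None if it cannot be found."""
    try:
        return import_module(*args, **kwargs)
    except ModuleNotFoundError:
        return None

json_mod = import_module_404ok_sketch("json")
missing = import_module_404ok_sketch("surely_not_an_installed_module_xyz")
```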

xiuminglib.imprt.preset_import(name, assert_success=False)[source]

A unified importer for both regular and google3 modules, according to specified presets/profiles (e.g., ignoring ModuleNotFoundError).

xiuminglib.interact module

xiuminglib.interact.ask_to_proceed(msg, level='warning')[source]

Pauses to ask the user whether to proceed.

Parameters:
  • msg (str) – Message to display to the user.
  • level (str, optional) – Message level, essentially deciding the message color: 'info', 'warning', or 'error'.
xiuminglib.interact.format_print(msg, fmt)[source]

Prints a message with format.

Parameters:
  • msg (str) – Message to print.
  • fmt (str) – Format; try your luck with any value – don’t worry; if it’s illegal, you will be prompted with all legal values.
xiuminglib.interact.print_attrs(obj, excerpts=None, excerpt_win_size=60, max_recursion_depth=None)[source]

Prints all attributes, recursively, of an object.

Parameters:
  • obj (object) – Object in which we search for the attribute.
  • excerpts (str or list(str), optional) – Print only excerpts containing certain attributes. None means to print all.
  • excerpt_win_size (int, optional) – How many characters get printed around a match.
  • max_recursion_depth (int, optional) – Maximum recursion depth. None means no limit.

xiuminglib.linalg module

xiuminglib.linalg.angle_between(vec1, vec2, radian=True)[source]

Computes the angle between two vectors.

Parameters:
  • vec1 (array_like) – Vector 1.
  • vec2 (array_like) – Vector 2.
  • radian (bool, optional) – Whether to use radians.
Returns:

The angle \(\in [0,\pi]\), or \(\in [0,180]\) if radian is False.

Return type:

float
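
The computation can be sketched as the arccosine of the normalized dot product. angle_between_sketch is a hypothetical illustration; the clipping step is a defensive choice against floating-point round-off and is not claimed to be in the library.

```python
import numpy as np

def angle_between_sketch(vec1, vec2, radian=True):
    """Angle between two vectors, in radians or degrees."""
    v1 = np.asarray(vec1, dtype=float)
    v2 = np.asarray(vec2, dtype=float)
    cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards against round-off
    return angle if radian else np.degrees(angle)

right_angle = angle_between_sketch([1, 0, 0], [0, 1, 0])
opposite_deg = angle_between_sketch([1, 0], [-1, 0], radian=False)
```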

xiuminglib.linalg.calc_refl_vec(h, l)[source]

Calculates the reflection vector given the half vector.

Parameters:
  • h (array_like) – Half vector as a 3-array.
  • l (array_like) – “Incident” vector (pointing outwards from the surface point), as a 3-array.
Returns:

Reflection vector as a 3-array.

Return type:

numpy.ndarray
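
Reflecting the incident vector about the half vector gives \(r = 2(h\cdot l)h - l\) for unit \(h\). The sketch below applies this textbook formula; calc_refl_vec_sketch is a hypothetical name, and normalizing \(h\) inside the function is an assumption.

```python
import numpy as np

def calc_refl_vec_sketch(h, l):
    """Reflects l about the (normalized) half vector h."""
    h = np.asarray(h, dtype=float)
    l = np.asarray(l, dtype=float)
    h = h / np.linalg.norm(h)
    return 2 * (h @ l) * h - l

h = [0, 0, 1]
l = np.array([1, 0, 1]) / np.sqrt(2)
r = calc_refl_vec_sketch(h, l)
half = (l + r) / np.linalg.norm(l + r)  # should recover h
```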

xiuminglib.linalg.is_identity(mat, eps=None)[source]

Checks if a matrix is an identity matrix.

If the input is not even square, False is returned.

Parameters:
  • mat (numpy.ndarray) – Input matrix.
  • eps (float, optional) – Numerical tolerance for equality. None means np.finfo(mat.dtype).eps.
Returns:

Whether the input is an identity matrix.

Return type:

bool

xiuminglib.linalg.is_symmetric(mat, eps=None)[source]

Checks if a matrix is symmetric.

If the input is not even square, False is returned.

Parameters:
  • mat (numpy.ndarray) – Input matrix.
  • eps (float, optional) – Numerical tolerance for equality. None means np.finfo(mat.dtype).eps.
Returns:

Whether the input is symmetric.

Return type:

bool

xiuminglib.linalg.main(func_name)[source]

Unit tests that can also serve as example usage.

xiuminglib.linalg.normalize(vecs, axis=0)[source]

Normalizes vectors.

Parameters:
  • vecs (array_like) – 1D array for a single vector, 2D array for multiple vectors, 3D array for an “image” of vectors, etc.
  • axis (int, optional) – Along which axis normalization is done.
Returns:

Normalized vector(s) of the same shape as input.

Return type:

numpy.ndarray

xiuminglib.linalg.project_onto(pts, basis)[source]

Projects points onto a basis vector.

Parameters:
  • pts (array_like) – 1D array for one vector; 2D N-by-M array for N M-D points.
  • basis (array_like) – 1D M-array specifying which basis vector to project to.
Returns:

Projected point(s) of the same shape.

Return type:

numpy.ndarray

xiuminglib.linalg.solve_quadratic_eqn(a, b, c)[source]

Solves \(ax^2+bx+c=0\).
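
The quadratic formula can be sketched with cmath so that complex roots are handled uniformly. solve_quadratic_sketch is hypothetical; the return format (a pair of complex numbers) is an assumption and may differ from the library's.

```python
import cmath

def solve_quadratic_sketch(a, b, c):
    """Returns both roots of a*x**2 + b*x + c = 0 as complex numbers."""
    if a == 0:
        raise ValueError("Not a quadratic equation: a must be nonzero")
    sqrt_disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + sqrt_disc) / (2 * a), (-b - sqrt_disc) / (2 * a)

roots_real = solve_quadratic_sketch(1, -3, 2)     # x^2 - 3x + 2 = 0
roots_complex = solve_quadratic_sketch(1, 0, 1)   # x^2 + 1 = 0
```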

xiuminglib.log module

xiuminglib.log.get_logger(level=None)[source]

Creates a logger for functions in the library.

Parameters:level (str, optional) – Logging level. Defaults to logging.INFO.
Returns:Logger created.
Return type:logging.Logger

xiuminglib.metric module

class xiuminglib.metric.Base(dtype)[source]

Bases: object

The base metric.

dtype

Data type, from which data dynamic range is derived.

Type:numpy.dtype
drange

Dynamic range, i.e., difference between the maximum and minimum allowed.

Type:float
__call__(im1, im2, **kwargs)[source]
Parameters:
  • im1 (numpy.ndarray) – An image of shape H-by-W, H-by-W-by-1, or H-by-W-by-3.
  • im2 (numpy.ndarray) – Another image, of the same shape as im1.
Returns:

The metric computed.

Return type:

float

__init__(dtype)[source]
Parameters:dtype (str or numpy.dtype) – Data type, from which dynamic range will be derived.
class xiuminglib.metric.LPIPS(dtype, weight_pb=None)[source]

Bases: xiuminglib.metric.Base

The Learned Perceptual Image Patch Similarity (LPIPS) metric (lower is better).

Project page: https://richzhang.github.io/PerceptualSimilarity/

Note

This implementation assumes the minimum value allowed is \(0\), so data dynamic range becomes the maximum value allowed.

dtype

Data type, from which data dynamic range is derived.

Type:numpy.dtype
drange

Dynamic range, i.e., difference between the maximum and minimum allowed.

Type:float
lpips_func

The LPIPS network packed into a function.

Type:tf.function
__call__(im1, im2)[source]
Parameters:
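  • im1 (numpy.ndarray) – First image; see Base.__call__().
  • im2 (numpy.ndarray) – Second image; see Base.__call__().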
Returns:

LPIPS computed (lower is better).

Return type:

float

__init__(dtype, weight_pb=None)[source]
Parameters:
  • dtype (str or numpy.dtype) – Data type, from which maximum allowed will be derived.
  • weight_pb (str, optional) – Path to the network weight protobuf. Defaults to the bundled net-lin_alex_v0.1.pb.
class xiuminglib.metric.PSNR(dtype)[source]

Bases: xiuminglib.metric.Base

Peak Signal-to-Noise Ratio (PSNR) in dB (higher is better).

If the inputs are RGB, they are first converted to luma (or relative luminance, if the inputs are not gamma-corrected). PSNR is computed on the luma.

__call__(im1, im2, mask=None)[source]
Parameters:
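  • im1 (numpy.ndarray) – First image; see Base.__call__().
  • im2 (numpy.ndarray) – Second image; see Base.__call__().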
  • mask (numpy.ndarray, optional) – An H-by-W logical array indicating pixels that contribute to the computation.
Returns:

PSNR in dB.

Return type:

float
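
The textbook PSNR definition, \(10\log_{10}(\text{drange}^2/\text{MSE})\), can be sketched as follows. psnr_sketch is a hypothetical helper that skips the luma conversion and masking described above; the dynamic range of 255 assumes uint8 data with a minimum of 0.

```python
import numpy as np

def psnr_sketch(im1, im2, drange):
    """PSNR in dB between two same-shape images with the given dynamic range."""
    diff = np.asarray(im1, dtype=float) - np.asarray(im2, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10 * np.log10(drange ** 2 / mse)

im1 = np.full((4, 4), 10, dtype=np.uint8)
im2 = np.full((4, 4), 11, dtype=np.uint8)
psnr_off_by_one = psnr_sketch(im1, im2, drange=255)
psnr_identical = psnr_sketch(im1, im1, drange=255)
```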

class xiuminglib.metric.SSIM(dtype)[source]

Bases: xiuminglib.metric.Base

The (multi-scale) Structural Similarity Index (SSIM) \(\in [0,1]\) (higher is better).

If the inputs are RGB, they are first converted to luma (or relative luminance, if the inputs are not gamma-corrected). SSIM is computed on the luma.

__call__(im1, im2, multiscale=False)[source]
Parameters:
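  • im1 (numpy.ndarray) – First image; see Base.__call__().
  • im2 (numpy.ndarray) – Second image; see Base.__call__().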
  • multiscale (bool, optional) – Whether to compute MS-SSIM.
Returns:

SSIM computed (higher is better).

Return type:

float

xiuminglib.metric.compute_ci(data, level=0.95)[source]

Computes confidence interval.

Parameters:
  • data (list(float)) – Samples.
  • level (float, optional) – Confidence level. Defaults to \(0.95\).
Returns:

Half-width of the confidence interval (i.e., the interval is mean \(\pm\) this number).

Return type:

float
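
A half-width of this kind can be sketched with a normal approximation from the standard library. This is an assumption: the library may well use a Student's t distribution instead, which gives wider intervals for small samples. ci_halfwidth_sketch and the sample values are made up for illustration.

```python
from statistics import NormalDist, mean, stdev

def ci_halfwidth_sketch(data, level=0.95):
    """Half-width of a normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level=0.95
    sem = stdev(data) / len(data) ** 0.5        # standard error of the mean
    return z * sem

samples = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7]
half_95 = ci_halfwidth_sketch(samples, level=0.95)
half_99 = ci_halfwidth_sketch(samples, level=0.99)
interval = (mean(samples) - half_95, mean(samples) + half_95)
```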

xiuminglib.os module

xiuminglib.os.call(cmd, cwd=None, wait=True, quiet=False)[source]

Executes a command in shell.

Parameters:
  • cmd (str) – Command to be executed.
  • cwd (str, optional) – Directory to execute the command in. None means current directory.
  • wait (bool, optional) – Whether to block until the call finishes.
  • quiet (bool, optional) – Whether to suppress printing of the output stream (if any) and error stream (if an error occurred).
Returns:

  • retcode (int) – Command exit code. 0 means a successful call. Always None if not waiting for the command to finish.
  • stdout (str) – Standard output stream. Always None if not waiting.
  • stderr (str) – Standard error stream. Always None if not waiting.

Return type:

tuple
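
The blocking case can be sketched with subprocess from the standard library. call_sketch is a hypothetical stand-in that covers only the waiting path, and the quoting of the example command assumes a POSIX-like shell.

```python
import subprocess
import sys

def call_sketch(cmd, cwd=None):
    """Runs a shell command, blocking until it finishes."""
    proc = subprocess.run(
        cmd, cwd=cwd, shell=True, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr

# Use the current interpreter so the example does not depend on other tools
retcode, stdout, stderr = call_sketch(f'"{sys.executable}" -c "print(42)"')
```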

xiuminglib.os.cp(src, dst, cns_parallel_copy=10)[source]

Copies files, possibly from/to the Google Colossus Filesystem.

Parameters:
  • src (str) – Source file or directory.
  • dst (str) – Destination file or directory.
  • cns_parallel_copy (int) – The number of files to be copied in parallel. Only effective when copying a directory from/to Colossus.
xiuminglib.os.exists_isdir(path)[source]

Determines whether a path exists, and if so, whether it is a file or directory.

Supports Google Colossus (CNS) paths by using gfile (preferred for speed) or the fileutil CLI.

Parameters:path (str) – A path.
Returns:
  • exists (bool) – Whether the path exists.
  • isdir (bool) – Whether the path is a directory (True) or a file (False). None if the path doesn’t exist.
Return type:tuple
xiuminglib.os.fix_terminal()[source]

Fixes messed up terminal.

xiuminglib.os.make_exp_dir(directory, param_dict, rm_if_exists=False)[source]

Makes an experiment output folder by hashing the experiment parameters.

Parameters:
  • directory (str) – The made folder will be under this.
  • param_dict (dict) – Dictionary of the parameters identifying the experiment. It is sorted by its keys, so different orders lead to the same hash.
  • rm_if_exists (bool, optional) – Whether to remove the experiment folder if it already exists.
Writes
  • The experiment parameters in <directory>/<hash>/param.json.
Returns:The experiment output folder just made.
Return type:str
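
Hashing parameters independently of key order can be sketched by serializing with sorted keys first. The choice of JSON and MD5 is an assumption for illustration; hash_params_sketch is a hypothetical helper, and the library's hash may be computed differently.

```python
import hashlib
import json

def hash_params_sketch(param_dict):
    """Hashes a parameter dictionary so that key order does not matter."""
    canonical = json.dumps(param_dict, sort_keys=True)
    return hashlib.md5(canonical.encode('utf-8')).hexdigest()

h1 = hash_params_sketch({'lr': 0.01, 'batch': 32})
h2 = hash_params_sketch({'batch': 32, 'lr': 0.01})  # same parameters, reordered
h3 = hash_params_sketch({'batch': 64, 'lr': 0.01})  # different experiment
```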
xiuminglib.os.makedirs(directory, rm_if_exists=False)[source]

Wraps os.makedirs() to support removing the directory if it already exists.

Google Colossus-compatible: it tries to use gfile first for speed. This will fail if Blaze is not used, in which case it then falls back to using fileutil CLI as external process calls.

Parameters:
  • directory (str) – Directory to make.
  • rm_if_exists (bool, optional) – Whether to remove the directory (and its contents) if it already exists.
xiuminglib.os.open_file(path, mode)[source]

Opens a file.

Supports Google Colossus if gfile can be imported.

Parameters:
  • path (str) – Path to open.
  • mode (str) – 'r', 'rb', 'w', or 'wb'.
Returns:

File handle that can be used as a context.

xiuminglib.os.rm(path)[source]

Removes a file or recursively a directory, with Google Colossus compatibility.

Parameters:path (str) – Path to the file or directory to remove.
xiuminglib.os.sortglob(directory, filename='*', ext=None, ext_ignore_case=False)[source]

Globs and then sorts filenames, possibly ending with multiple extensions, in a directory.

Supports Google Colossus, by using gfile (preferred for speed) or the fileutil CLI when Blaze is not used (hence, gfile unavailable).

Parameters:
  • directory (str) – Directory to glob, e.g., '/path/to/'.
  • filename (str or tuple(str), optional) – Filename pattern excluding extensions, e.g., 'img*'.
  • ext (str or tuple(str), optional) – Extensions of interest, e.g., ('png', 'jpg'). None means no extension, useful for folders or files with no extension.
  • ext_ignore_case (bool, optional) – Whether to ignore case for extensions.
Returns:

Sorted list of files globbed.

Return type:

list(str)
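
For local paths, the glob-then-sort behavior can be sketched as below. sortglob_sketch is a simplified, hypothetical version: it has no Colossus support and no case-insensitive matching, and with ext=None it simply globs the bare filename pattern.

```python
import os
import tempfile
from glob import glob

def sortglob_sketch(directory, filename='*', ext=None):
    """Globs files matching filename with any of the extensions, then sorts."""
    exts = (ext,) if isinstance(ext, str) else ext
    if exts is None:
        patterns = [filename]
    else:
        patterns = [f'{filename}.{e}' for e in exts]
    files = []
    for pattern in patterns:
        files += glob(os.path.join(directory, pattern))
    return sorted(files)

tmp_dir = tempfile.mkdtemp()
for name in ('img2.png', 'img1.jpg', 'img0.png', 'notes.txt'):
    open(os.path.join(tmp_dir, name), 'w').close()
found = [os.path.basename(p)
         for p in sortglob_sketch(tmp_dir, 'img*', ('png', 'jpg'))]
```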

xiuminglib.sig module

xiuminglib.sig.dct_1d_bases(n)[source]

Generates 1D discrete cosine transform (DCT) bases.

Bases are rows of \(Y\), which is orthogonal: \(Y^TY=YY^T=I\). The forward process (analysis) is \(X=Yx\), and the inverse (synthesis) is \(x=Y^{-1}X=Y^TX\). See main() for example usages and how this produces the same results as scipy.fftpack.dct() (with type=2 and norm='ortho').

Parameters:n (int) – Signal length.
Returns:Matrix whose \(i\)-th row, when dotted with signal (column) vector, gives the coefficient for the \(i\)-th DCT component. Of shape (n, n).
Return type:numpy.ndarray
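For illustration, an orthonormal DCT-II matrix with the properties stated above can be built directly with NumPy. This is a sketch under the assumption that the bases follow the type-2, norm='ortho' convention; dct_1d_bases_sketch is a hypothetical name, not the library function.

```python
import numpy as np


def dct_1d_bases_sketch(n):
    """Orthonormal DCT-II matrix of shape (n, n); rows are the bases."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    ymat = np.sqrt(2 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    ymat[0, :] /= np.sqrt(2)  # rescale the DC row so that all rows have unit norm
    return ymat


n = 8
ymat = dct_1d_bases_sketch(n)
x = np.random.RandomState(0).rand(n)

coeffs = ymat @ x  # analysis: X = Y x
x_recon = ymat.T @ coeffs  # synthesis: x = Y^T X
```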
xiuminglib.sig.dct_2d_bases(h, w)[source]

Generates bases for 2D discrete cosine transform (DCT).

Bases are given in two matrices \(Y_h\) and \(Y_w\). See dct_1d_bases() for their properties. Note that \(Y_w\) has already been transposed (hence, \(Y_hxY_w\) instead of \(Y_hxY_w^T\) below).

Input image \(x\) should be transformed by both matrices (i.e., along both dimensions). Specifically, the analysis process is \(X=Y_hxY_w\), and the synthesis process is \(x=Y_h^TXY_w^T\). See main() for example usages.

Parameters:
  • h (int) – Image height.
  • w (int) – Image width.
Returns:

  • dct_mat_h (numpy.ndarray) – DCT matrix \(Y_h\) transforming rows of the 2D signal. Of shape (h, h).
  • dct_mat_w (numpy.ndarray) – \(Y_w\) transforming columns. Of shape (w, w).

Return type:

tuple
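The analysis and synthesis formulas above can be checked numerically. The sketch below builds the two matrices with a hypothetical helper dct_1d (orthonormal DCT-II, assumed rather than taken from the library) and stores \(Y_w\) already transposed, as described.

```python
import numpy as np


def dct_1d(n):
    """Orthonormal DCT-II matrix (hypothetical helper for this sketch)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    ymat = np.sqrt(2 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    ymat[0, :] /= np.sqrt(2)
    return ymat


h, w = 4, 6
y_h = dct_1d(h)  # acts on the height dimension, from the left
y_w = dct_1d(w).T  # acts on the width dimension, from the right (already transposed)
x = np.random.RandomState(0).rand(h, w)

coeffs = y_h @ x @ y_w  # analysis: X = Y_h x Y_w
x_recon = y_h.T @ coeffs @ y_w.T  # synthesis: x = Y_h^T X Y_w^T
```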

xiuminglib.sig.dct_2d_bases_vec(h, w)[source]

Generates bases stored in a single matrix, along whose height 2D frequencies get raveled.

Using the “vectorization + Kronecker product” trick: \(\operatorname{vec}(Y_hxY_w)=\left(Y_w^T\otimes Y_h\right) \operatorname{vec}(x)\). So unlike dct_2d_bases(), this function generates a single matrix \(Y=Y_w^T\otimes Y_h\), whose \(k\)-th row is the flattened \((i, j)\)-th basis, where \(k=wi+j\).

Input image \(x\) can be transformed with a single matrix multiplication. Specifically, the analysis process is \(X=Y \operatorname{vec}(x)\), and the synthesis process is \(x= \operatorname{unvec}(Y^TX)\). See main() for examples.

Warning

If you want to reconstruct the signal with only some (i.e., not all) bases, do not slice those rows out from \(Y\) and use only their coefficients. Instead, you should use the full \(Y\) matrix and set to zero the coefficients for the unused frequency components. See main() for examples.

Parameters:
  • h (int) – Image height.
  • w (int) – Image width.
Returns:

Matrix with flattened bases as rows. The \(k\)-th row, when numpy.reshape()’ed into (h, w), is the \((i, j)\)-th frequency component, where \(k=wi+j\). Of shape (h * w, h * w).

Return type:

numpy.ndarray
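The vectorization identity quoted above can be verified with numpy.kron(). In this sketch vec is taken to be column stacking (order='F'), which is the usual reading of the identity; whether the library ravels in this order is not checked here. Arbitrary orthogonal matrices stand in for \(Y_h\) and \(Y_w\).

```python
import numpy as np

rng = np.random.RandomState(0)
h, w = 3, 4
# Any orthogonal matrices suffice for checking the identity (stand-ins for Y_h, Y_w)
y_h, _ = np.linalg.qr(rng.rand(h, h))
y_w, _ = np.linalg.qr(rng.rand(w, w))
x = rng.rand(h, w)


def vec(mat):
    """Stacks the columns of `mat` into one vector (assumed vec convention)."""
    return mat.ravel(order='F')


ymat = np.kron(y_w.T, y_h)  # single (h * w)-by-(h * w) matrix
lhs = vec(y_h @ x @ y_w)  # two-matrix analysis, then vectorized
rhs = ymat @ vec(x)  # one matrix multiplication

# Synthesis: undo with the transpose, then "unvec" in the same order
x_recon = (ymat.T @ rhs).reshape((h, w), order='F')
```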

xiuminglib.sig.dft_1d_bases(n)[source]

Generates 1D discrete Fourier transform (DFT) bases.

Bases are rows of \(Y\), which is unitary (\(Y^HY=YY^H=I\), where \(Y^H\) is the conjugate transpose) and symmetric. The forward process (analysis) is \(X=Yx\), and the inverse (synthesis) is \(x=Y^{-1}X=Y^HX\). See main() for example usages.

Parameters:n (int) – Signal length.
Returns:Matrix whose \(i\)-th row, when dotted with signal (column) vector, gives the coefficient for the \(i\)-th Fourier component. Of shape (n, n).
Return type:numpy.ndarray
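As an illustration, a unitary and symmetric DFT matrix can be written down directly and compared against numpy.fft.fft() with norm='ortho'. dft_1d_bases_sketch is a hypothetical name, and the sign convention of the exponent is assumed to follow NumPy's forward transform.

```python
import numpy as np


def dft_1d_bases_sketch(n):
    """Unitary DFT matrix of shape (n, n); rows are the bases."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    return np.exp(-2j * np.pi * k * i / n) / np.sqrt(n)


n = 8
ymat = dft_1d_bases_sketch(n)
x = np.random.RandomState(0).rand(n)

coeffs = ymat @ x  # analysis: X = Y x
x_recon = ymat.conj().T @ coeffs  # synthesis: x = Y^H X
matches_numpy = np.allclose(coeffs, np.fft.fft(x, norm='ortho'))
```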
xiuminglib.sig.dft_2d_bases(h, w)[source]

Generates bases for 2D discrete Fourier transform (DFT).

Bases are given in two matrices \(Y_h\) and \(Y_w\). See dft_1d_bases() for their properties. Note that \(Y_w\) has already been transposed.

Input image \(x\) should be transformed by both matrices (i.e., along both dimensions). Specifically, the analysis process is \(X=Y_hxY_w\), and the synthesis process is \(x=Y_h^HXY_w^H\). See main() for example usages and how this produces the same results as numpy.fft.fft2() (with norm='ortho').

See also

From numpy.fft: A[1:n/2] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency. For an even number of input points, A[n/2] represents both positive and negative Nyquist frequency, and is also purely real for real input. For an odd number of input points, A[(n-1)/2] contains the largest positive frequency, while A[(n+1)/2] contains the largest negative frequency.

Parameters:
  • h (int) – Image height.
  • w (int) – Image width.
Returns:

  • dft_mat_h (numpy.ndarray) – DFT matrix \(Y_h\) transforming rows of the 2D signal. Of shape (h, h).
  • dft_mat_w (numpy.ndarray) – \(Y_w\) transforming columns. Of shape (w, w).

Return type:

tuple
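The two-matrix analysis can be compared against numpy.fft.fft2() with norm='ortho' as follows. This is a sketch with a hypothetical helper dft_1d, not the library's code.

```python
import numpy as np


def dft_1d(n):
    """Unitary n-by-n DFT matrix (hypothetical helper for this sketch)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    return np.exp(-2j * np.pi * k * i / n) / np.sqrt(n)


h, w = 4, 6
y_h = dft_1d(h)
y_w = dft_1d(w).T  # "already transposed"; a no-op here because the matrix is symmetric
x = np.random.RandomState(0).rand(h, w)

coeffs = y_h @ x @ y_w  # analysis: X = Y_h x Y_w
x_recon = y_h.conj().T @ coeffs @ y_w.conj().T  # synthesis: x = Y_h^H X Y_w^H
matches_numpy = np.allclose(coeffs, np.fft.fft2(x, norm='ortho'))
```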

xiuminglib.sig.dft_2d_bases_vec(h, w)[source]

Generates bases stored in a single matrix, along whose height 2D frequencies get raveled.

Using the “vectorization + Kronecker product” trick: \(\operatorname{vec}(Y_hxY_w)=\left(Y_w^T\otimes Y_h\right) \operatorname{vec}(x)\). So unlike dft_2d_bases(), this function generates a single matrix \(Y=Y_w^T\otimes Y_h\), whose \(k\)-th row is the flattened \((i, j)\)-th basis, where \(k=wi+j\).

Input image \(x\) can be transformed with a single matrix multiplication. Specifically, the analysis process is \(X=Y\operatorname{vec}(x)\), and the synthesis process is \(x=\operatorname{unvec}(Y^HX)\). See main() for examples.

Parameters:
  • h (int) – Image height.
  • w (int) – Image width.
Returns:

Complex matrix with flattened bases as rows. The \(k\)-th row, when numpy.reshape()’ed into (h, w), is the \((i, j)\)-th frequency component, where \(k=wi+j\). Of shape (h * w, h * w).

Return type:

numpy.ndarray

xiuminglib.sig.dft_2d_freq(h, w)[source]

Gets 2D discrete Fourier transform (DFT) sample frequencies.

Parameters:
  • h (int) – Image height.
  • w (int) – Image width.
Returns:

  • freq_h (numpy.ndarray) – Sample frequencies, in cycles per pixel, along the height dimension. E.g., if freq_h[i, j] == 0.5, then the (i, j)-th component repeats every 2 pixels along the height dimension.
  • freq_w (numpy.ndarray) – Sample frequencies, in cycles per pixel, along the width dimension.

Return type:

tuple
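One way to obtain per-component sample frequencies of this form is numpy.fft.fftfreq() combined with numpy.meshgrid(). The sketch assumes the frequency layout follows NumPy's FFT ordering; it is not necessarily how the library computes them.

```python
import numpy as np

h, w = 4, 6
# fftfreq returns frequencies in cycles per sample, in NumPy's FFT order
freq_h, freq_w = np.meshgrid(
    np.fft.fftfreq(h), np.fft.fftfreq(w), indexing='ij')

# Both arrays are h-by-w: freq_h varies along rows, freq_w along columns
period_h = 1 / abs(freq_h[2, 0])  # pixels per cycle for the Nyquist row
```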

xiuminglib.sig.get_extrema(arr, top=True, n=1, n_std=None)[source]

Gets top (or bottom) N value(s) from an M-D array, with the option to ignore outliers.

Parameters:
  • arr (array_like) – Array, which will be flattened if high-D.
  • top (bool, optional) – Whether to find the top or bottom N.
  • n (int, optional) – Number of values to return.
  • n_std (float, optional) – Definition of outliers to exclude, assuming Gaussian. None means assuming no outlier.
Returns:

  • ind (tuple) – Indices that give the extrema, M-tuple of arrays of N integers.
  • val (numpy.ndarray) – Extremum values, i.e., arr[ind].

Return type:

tuple
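A possible NumPy sketch of this lookup follows. Two points are assumptions: outliers are taken to be values farther than n_std standard deviations from the mean, and indices are returned in the shape of the input array. get_extrema_sketch is a hypothetical name.

```python
import numpy as np


def get_extrema_sketch(arr, top=True, n=1, n_std=None):
    """Hypothetical stand-in: top (or bottom) n values with optional outlier removal."""
    arr = np.asarray(arr, dtype=float)
    flat = arr.ravel()
    valid = np.ones(flat.shape, dtype=bool)
    if n_std is not None:
        # Assumed outlier rule: farther than n_std standard deviations from the mean
        valid = np.abs(flat - flat.mean()) <= n_std * flat.std()
    candidates = np.flatnonzero(valid)
    order = np.argsort(flat[candidates])  # ascending
    if top:
        order = order[::-1]
    picked = candidates[order[:n]]
    ind = np.unravel_index(picked, arr.shape)  # M-tuple of index arrays
    return ind, arr[ind]


data = [[1, 9, 3], [100, 5, 7]]
ind_top, val_top = get_extrema_sketch(data, top=True, n=2)
ind_in, val_in = get_extrema_sketch(data, top=True, n=1, n_std=1)
ind_bot, val_bot = get_extrema_sketch(data, top=False, n=1)
```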

xiuminglib.sig.main(test_id)[source]

Unit tests that can also serve as example usage.

xiuminglib.sig.pca(data_mat, n_pcs=None, eig_method='scipy.sparse.linalg.eigsh')[source]

Performs principal component (PC) analysis on data.

The analysis is done via eigendecomposition of the covariance matrix. See main() for example usages, including reconstructing data with top K PCs.

Parameters:
  • data_mat (array_like) – Data matrix of N data points in the M-D space, of shape M-by-N, where each column is a point.
  • n_pcs (int, optional) – Number of top PCs requested. None means \(M-1\).
  • eig_method (str, optional) – Method for eigendecomposition of the symmetric covariance matrix: 'numpy.linalg.eigh' or 'scipy.sparse.linalg.eigsh'.
Returns:

  • pcvars (numpy.ndarray) – PC variances (eigenvalues of covariance matrix) in descending order.
  • pcs (numpy.ndarray) – Corresponding PCs (normalized eigenvectors), of shape M-by-n_pcs. Each column is a PC.
  • projs (numpy.ndarray) – Data points centered and then projected to the n_pcs-D PC space. Of shape n_pcs-by-N. Each column is a point.
  • data_mean (numpy.ndarray) – Mean that can be used to recover raw data. Of length M.

Return type:

tuple
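The eigendecomposition route can be sketched with numpy.linalg.eigh() as below. pca_sketch is a hypothetical name, and normalizing the covariance by N - 1 is an assumption; the library may scale differently.

```python
import numpy as np


def pca_sketch(data_mat, n_pcs=None):
    """Hypothetical stand-in: PCA of an M-by-N matrix whose columns are points."""
    data_mat = np.asarray(data_mat, dtype=float)
    m, n = data_mat.shape
    if n_pcs is None:
        n_pcs = m - 1
    data_mean = data_mat.mean(axis=1)
    centered = data_mat - data_mean[:, None]
    cov = centered @ centered.T / (n - 1)  # M-by-M; N - 1 scaling is assumed
    evals, evecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_pcs]  # keep the top n_pcs, descending
    pcvars = evals[order]
    pcs = evecs[:, order]  # each column is a unit-norm PC
    projs = pcs.T @ centered  # n_pcs-by-N
    return pcvars, pcs, projs, data_mean


data = np.random.RandomState(0).rand(3, 20)
pcvars, pcs, projs, data_mean = pca_sketch(data, n_pcs=3)
# With all PCs kept, the raw data are recovered exactly (up to round-off)
recon = pcs @ projs + data_mean[:, None]
```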

xiuminglib.sig.sh_bases_real(l, n_lat, coord_convention='colatitude-azimuth', _check_orthonormality=False)[source]

Generates real spherical harmonics (SHs).

See main() for example usages, including how to do both analysis and synthesis with the SHs.

Not accurate when n_lat is too small. E.g., orthonormality no longer holds when discretization is too coarse (small n_lat), as numerical integration fails to approximate the continuous integration.

Parameters:
  • l (int) – Up to which band (starting from 0). The number of harmonics is \((l+1)^2\). In other words, all harmonics within each band (\(-l\leq m\leq l\)) are used.
  • n_lat (int) – Number of discretization levels of colatitude (for colatitude-azimuth convention; \([0, \pi]\)) or latitude (for latitude-longitude convention; \([-\frac{\pi}{2}, \frac{\pi}{2}]\)). With the same step size, n_azimuth will be twice as big, since azimuth (in colatitude-azimuth convention; \([0, 2\pi]\)) or longitude (in latitude-longitude convention; \([-\pi, \pi]\)) spans \(2\pi\).
  • coord_convention (str, optional) –

    Coordinate system convention to use: 'colatitude-azimuth' or 'latitude-longitude'. Colatitude-azimuth vs. latitude-longitude convention:

    3D
                                       ^ z (colat = 0; lat = pi/2)
                                       |
              (azi = 3pi/2;            |
               lng = -pi/2)   ---------+---------> y (azi = pi/2;
                                     ,'|              lng = pi/2)
                                   ,'  |
        (colat = pi/2, azi = 0;  x     | (colat = pi; lat = -pi/2)
         lat = 0, lng = 0)
    
    2D
        (0, 0)                               (pi/2, 0)
           +----------->  (0, 2pi)               ^ lat
           |            azi                      |
           |                                     |
           |                     (0, -pi) -------+-------> (0, pi)
           v colat                               |        lng
        (pi, 0)                                  |
                                            (-pi/2, 0)
    
  • _check_orthonormality (bool, optional, internal) – Whether to check orthonormality of the generated harmonics.
Returns:

  • ymat (numpy.ndarray) – Matrix whose rows are spherical harmonics as generated by scipy.special.sph_harm(). When dotted with flattened image (column) vector weighted by areas_on_unit_sphere, the \(i\)-th row gives the coefficient for the \(i\)-th harmonic, where \(i=l(l+1)+m\). The input signal (in the form of 2D image indexed by two angles) should be flattened with numpy.ndarray.ravel(), in row-major order: the row index varies the slowest, and the column index the quickest. Of shape ((l + 1) ** 2, 2 * n_lat ** 2).
  • areas_on_unit_sphere (numpy.ndarray) – Area of the unit sphere covered by each sample point. This is proportional to sine of colatitude and has nothing to do with azimuth/longitude. Used as weights for discrete summation to approximate continuous integration. Necessary in SH analysis. Flattened also in row-major order. Of length n_lat * (2 * n_lat).

Return type:

tuple
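To make the role of areas_on_unit_sphere concrete, the sketch below builds sine-of-colatitude area weights on a colatitude-azimuth grid and uses them to approximate two continuous integrals. Sampling at cell centers is an assumption of this sketch, not necessarily what the library does.

```python
import numpy as np

n_lat = 64
n_azi = 2 * n_lat  # same step size, twice the range
step = np.pi / n_lat

# Assumed sampling: cell centers in colatitude [0, pi] and azimuth [0, 2 pi]
colat = (np.arange(n_lat) + 0.5) * step
# Area element on the unit sphere: sin(colat) * d(colat) * d(azi)
areas = np.sin(colat)[:, None] * step * step * np.ones((1, n_azi))
areas_flat = areas.ravel()  # row-major, of length n_lat * (2 * n_lat)

total_area = areas_flat.sum()  # approximates 4 pi
# The l = 0 harmonic is the constant 1 / (2 sqrt(pi)); its squared norm should be about 1
y00 = np.full(areas_flat.shape, 1 / (2 * np.sqrt(np.pi)))
norm_y00 = np.sum(areas_flat * y00 ** 2)
```

With a much smaller n_lat the two sums drift away from their continuous values, which is the loss of accuracy noted above.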

xiuminglib.sig.smooth_1d(arr, win_size, kernel_type='half')[source]

Smooths 1D signal.

Parameters:
  • arr (array_like) – 1D signal to smooth.
  • win_size (int) – Size of the smoothing window. Use odd number.
  • kernel_type (str, optional) – Kernel type: 'half' (e.g., normalized \([2^{-2}, 2^{-1}, 2^0, 2^{-1}, 2^{-2}]\)) or 'equal' (e.g., normalized \([1, 1, 1, 1, 1]\)).
Returns:

Smoothed 1D signal.

Return type:

numpy.ndarray
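The two kernel types can be illustrated with a short NumPy sketch. smooth_1d_sketch is a hypothetical name, and the edge handling (replicating the end samples) is an assumption; the library may treat boundaries differently.

```python
import numpy as np


def smooth_1d_sketch(arr, win_size, kernel_type='half'):
    """Hypothetical stand-in: smooth a 1D signal with a normalized kernel."""
    arr = np.asarray(arr, dtype=float)
    if win_size % 2 != 1:
        raise ValueError("win_size should be odd")
    half = win_size // 2
    if kernel_type == 'half':
        # Weights halve with each step away from the center
        kernel = 2.0 ** -np.abs(np.arange(-half, half + 1))
    elif kernel_type == 'equal':
        kernel = np.ones(win_size)
    else:
        raise ValueError(kernel_type)
    kernel = kernel / kernel.sum()
    padded = np.pad(arr, half, mode='edge')  # assumed boundary treatment
    return np.convolve(padded, kernel, mode='valid')  # same length as the input


impulse = [0, 0, 1, 0, 0]
out_equal = smooth_1d_sketch(impulse, 3, kernel_type='equal')
out_half = smooth_1d_sketch(impulse, 3, kernel_type='half')
```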

xiuminglib.tracker module

class xiuminglib.tracker.LucasKanadeTracker(frames, pts, backtrack_thres=1, lk_params=None)[source]

Bases: object

Lucas-Kanade tracker.

frames

Grayscale.

Type:list(numpy.array)
pts
Type:numpy.array
lk_params
Type:dict
backtrack_thres
Type:float
tracks

Positions of tracks from the \(i\)-th to \((i+1)\)-th frame. Arrays are of shape N-by-2.

+------------>
|       tracks[:, 1]
|
|
v tracks[:, 0]
Type:list(numpy.array)
can_backtrack

Whether each track can be back-tracked to the previous frame. Arrays should be Boolean.

Type:list(numpy.array)
is_lost

Whether each track is lost in this frame. Arrays should be Boolean.

Type:list(numpy.array)
__init__(frames, pts, backtrack_thres=1, lk_params=None)[source]
Parameters:
  • frames (list(numpy.array)) – Frame images in order. Arrays are either H-by-W or H-by-W-by-3, and will be converted to grayscale.
  • pts (array_like) –

    Points to track in the first frame. Of shape N-by-2.

    +------------>
    |       pts[:, 1]
    |
    |
    v pts[:, 0]
    
  • backtrack_thres (float, optional) – Largest pixel deviation in the \(x\) or \(y\) direction of a successful backtrack.
  • lk_params (dict, optional) – Keyword parameters for cv2.calcOpticalFlowPyrLK().
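The backtrack_thres criterion can be sketched in NumPy alone: a point counts as back-trackable when tracking it forward and then backward lands within the threshold of where it started, in both coordinates. The function name and the non-strict comparison are assumptions; the optical-flow calls themselves are omitted.

```python
import numpy as np


def can_backtrack_sketch(pts_prev, pts_back, backtrack_thres=1):
    """Hypothetical check: per-point Boolean, True if the round trip stays close.

    pts_prev: N-by-2 points in the previous frame.
    pts_back: N-by-2 points obtained by tracking forward, then backward.
    """
    deviation = np.abs(np.asarray(pts_back, dtype=float)
                       - np.asarray(pts_prev, dtype=float))
    # Largest deviation over the two coordinates must not exceed the threshold
    return deviation.max(axis=1) <= backtrack_thres


pts_prev = [[10, 10], [20, 20], [30, 30]]
pts_back = [[10.4, 9.7], [20, 22.5], [30.5, 30]]
can_backtrack = can_backtrack_sketch(pts_prev, pts_back)
```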
run(constrain=None)[source]

Runs tracking.

Parameters:constrain (function, optional) – Function applied to tracks before being fed to the next round. It should take in an N-by-2 array as well as the current workspace (as a dictionary) and return another array.
vis(out_dir, marker_bgr=(0, 0, 255))[source]

Visualizes results.

Parameters:
  • out_dir (str) – Output directory.
  • marker_bgr (tuple, optional) – Marker BGR color.
Writes
  • Each frame with tracked points marked out.