xiuminglib package¶
Subpackages¶
Submodules¶
xiuminglib.camera module¶
-
class
xiuminglib.camera.
PerspCam
(name='cam', f_pix=533.33, im_res=(256, 256), loc=(1, 1, 1), lookat=(0, 0, 0), up=(0, 1, 0))[source]¶ Bases:
object
Perspective camera in 35mm format.
This is not an OpenGL/Blender camera (where \(+x\) points right, \(+y\) up, and \(-z\) into the viewing direction), but rather a “CV camera” (where \(+x\) points right, \(+y\) down, and \(+z\) into the viewing direction). See more in ext_mat.
Because we mostly consider just the camera and the object, we assume the object coordinate system (the “local system” in Blender) aligns with (and hence, is the same as) the world coordinate system (the “global system” in Blender).
Note
- Sensor width of the 35mm format is actually 36mm.
- This class assumes unit pixel aspect ratio (i.e., \(f_x = f_y\)) and no skewing between the sensor plane and optical axis.
- The active sensor size may be smaller than sensor_w and sensor_h, depending on im_res. See sensor_w_active and sensor_h_active.
- aov, sensor_h, and sensor_w are hardware properties, having nothing to do with im_res.
-
__init__
(name='cam', f_pix=533.33, im_res=(256, 256), loc=(1, 1, 1), lookat=(0, 0, 0), up=(0, 1, 0))[source]¶ Parameters: - name (str, optional) – Camera name.
- f_pix (float, optional) – Focal length in pixels.
- im_res (array_like, optional) – Image height and width in pixels.
- loc (array_like, optional) – Camera location in object space.
- lookat (array_like, optional) – Where the camera points to in object space, so default \((0, 0, 0)\) is the object center.
- up (array_like, optional) – Vector in object space that, when projected, points upward in image.
-
aov
¶ Vertical and horizontal angles of view in degrees.
Type: numpy.ndarray
-
backproj
(depth, fg_mask=None, bg_fill=0.0, depth_type='plane', space='object')[source]¶ Backprojects a depth map to 3D points.
Resolution of the depth map may be different from im_h and im_w: im_h and im_w decide the image coordinate bounds, and the depth resolution decides the number of steps.
Parameters: - depth (numpy.ndarray) – Depth map.
- fg_mask (numpy.ndarray, optional) – Backproject only pixels falling inside this foreground mask. Its values should be logical.
- bg_fill (float, optional) – Filler value for background region.
- depth_type (str, optional) – Plane or ray depth.
- space (str, optional) – In which space the backprojected points are specified: 'object' or 'camera'.
Returns: \(xyz\) map.
Return type:
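The difference between plane and ray depth can be sketched for a single pixel as follows. This is a minimal illustration, not the library's implementation: the function name backproject_pixel is hypothetical, the principal point is assumed to sit at the image center, and the result is in camera space under the CV convention described above.

```python
import math


def backproject_pixel(v, u, depth, f_pix, im_res, depth_type="plane"):
    """Backprojects one pixel (v: vertical, u: horizontal) to a camera-space point.

    Hypothetical helper; assumes the principal point is the image center.
    """
    h, w = im_res
    cx, cy = w / 2, h / 2
    # Direction of the ray through the pixel, scaled so that z = 1
    dx, dy, dz = (u - cx) / f_pix, (v - cy) / f_pix, 1.0
    if depth_type == "plane":
        # Plane depth is measured along the optical (z) axis
        scale = depth
    elif depth_type == "ray":
        # Ray depth is measured along the ray itself
        scale = depth / math.sqrt(dx * dx + dy * dy + dz * dz)
    else:
        raise ValueError(depth_type)
    return (dx * scale, dy * scale, dz * scale)


print(backproject_pixel(0, 0, 2.0, 533.33, (256, 256), depth_type="plane"))
print(backproject_pixel(0, 0, 2.0, 533.33, (256, 256), depth_type="ray"))
```

For the center pixel the two depth types coincide; away from the center, a plane-depth point keeps z equal to the depth, while a ray-depth point keeps its distance from the camera equal to the depth.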
-
blender_rot_euler
¶ Euler rotations in degrees.
Type: numpy.ndarray
-
ext_mat
¶ \(3\times 4\) object-to-camera extrinsics matrix, i.e., rotation and translation that transform a point from object space to camera space.
Two coordinate systems involved: object space “obj” and camera space following the computer vision convention “cv”, where \(+x\) horizontally points right (to align with pixel coordinates), \(+y\) vertically points down, and \(+z\) is the look-at direction (because right-handed).
Type: numpy.ndarray
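As a small sketch of how the two camera conventions relate: going from the CV convention to the OpenGL/Blender one flips the sign of the \(y\) and \(z\) coordinates of a camera-space point. The helper name cv_to_opengl is hypothetical and not part of the library.

```python
def cv_to_opengl(point_cv):
    """Converts a camera-space point from the CV to the OpenGL/Blender convention."""
    x, y, z = point_cv
    # +x is shared; +y (down vs. up) and +z (forward vs. backward) flip sign
    return (x, -y, -z)


# A point 2 units in front of a CV camera, slightly below the optical axis
print(cv_to_opengl((0.0, 0.5, 2.0)))
```

Applying the same flip twice returns the original point, so the same function also converts in the other direction.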
-
ext_mat_4x4
¶ Padding \([0, 0, 0, 1]\) to bottom of the \(3\times 4\) extrinsics matrix to make it invertible.
Type: numpy.ndarray
-
gen_rays
(spp=1)[source]¶ Generates ray directions in object space, with the ray origin being the camera location.
Parameters: spp (int, optional) – Samples (or number of rays) per pixel. Must be a perfect square \(S^2\) due to uniform, deterministic supersampling. Returns: An \(H\times W\times S^2\times 3\) array of ray directions. Return type: numpy.ndarray
-
get_cam2obj
(cam_type='cv', square=False)[source]¶ Inverse of get_obj2cam().
One example use: calling this with cam_type='blender' gives Blender’s cam.matrix_world.
-
get_obj2cam
(cam_type='cv', square=False)[source]¶ Gets the object-to-camera transformation matrix.
Parameters: Returns: \(3\times 4\) or \(4\times 4\) object-to-camera transformation matrix.
Return type:
-
int_mat
¶ \(3\times 3\) intrinsics matrix.
Type: numpy.ndarray
-
loc
¶ Camera location in object space.
Type: numpy.ndarray
-
lookat
¶ Where in object space the camera points to.
Type: numpy.ndarray
-
proj
(pts, space='object')[source]¶ Projects 3D points to 2D.
Parameters: - pts (array_like) – 3D point(s) of shape \(N\times 3\) or \(3\times N\), or of length 3.
- space (str, optional) – In which space these points are specified: 'object' or 'camera'.
Returns: Vertical and horizontal coordinates of the projections, following:
+-----------> dim1
|
|
|
v dim0
Return type: array_like
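The projection of a camera-space point can be sketched with the pinhole model below. This is an illustration under stated assumptions rather than the library's code: project_pinhole is a hypothetical name, the principal point is assumed to be the image center, and the pixel aspect ratio is one (as the class note states).

```python
def project_pinhole(pt_cam, f_pix, im_res):
    """Projects a CV-convention camera-space point to (vertical, horizontal) pixels.

    Hypothetical helper; assumes the principal point is the image center.
    """
    h, w = im_res
    x, y, z = pt_cam
    if z <= 0:
        raise ValueError("Point must be in front of the camera (z > 0)")
    cx, cy = w / 2, h / 2
    u = f_pix * x / z + cx  # horizontal coordinate (dim1)
    v = f_pix * y / z + cy  # vertical coordinate (dim0)
    return v, u


# A point on the optical axis lands at the image center
print(project_pinhole((0.0, 0.0, 1.0), 533.33, (256, 256)))
```

The return order (vertical first, then horizontal) follows the dim0/dim1 convention drawn above.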
-
proj_mat
¶ \(3\times 4\) projection matrix, derived from intrinsics and extrinsics.
Type: numpy.ndarray
-
resize
(new_h=None, new_w=None)[source]¶ Updates the camera intrinsics according to the new size.
Parameters:
-
sensor_fit_horizontal
¶ Whether field of view angle fits along the horizontal or vertical direction.
Type: bool
-
set_from_mitsuba
(xml_path)[source]¶ Sets camera according to a Mitsuba XML file.
Parameters: xml_path (str) – Path to the XML file.
-
to_dict
(app=None)[source]¶ Converts this camera to a dictionary of its properties.
Parameters: app (str, optional) – For what application are we converting? Accepted are None and 'blender'.
Returns: This camera as a dictionary. Return type: dict
-
up
¶ Up vector, the vector in object space that, when projected, points upward on image plane.
Type: numpy.ndarray
xiuminglib.const module¶
-
class
xiuminglib.const.
Dir
[source]¶ Bases:
object
-
data
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data'¶
-
mstatus
= '/tmp/machine-status/runtime'¶
-
tmp
= '/tmp/'¶
-
-
class
xiuminglib.const.
Path
[source]¶ Bases:
object
-
armadillo
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/armadillo.ply'¶
-
buddha
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip.ply'¶
-
buddha_prefix
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip'¶
-
buddha_res2
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res2.ply'¶
-
buddha_res3
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res3.ply'¶
-
buddha_res4
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/buddha/happy_vrip_res4.ply'¶
-
bunny
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper.ply'¶
-
bunny_prefix
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper'¶
-
bunny_res2
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res2.ply'¶
-
bunny_res3
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res3.ply'¶
-
bunny_res4
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/bunny/bun_zipper_res4.ply'¶
-
cameraman
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/images/cameraman.png'¶
-
checker
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/textures/checker.png'¶
-
cpustatus
= '/tmp/cpu/machine_status.txt'¶
-
dragon
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip.ply'¶
-
dragon_prefix
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip'¶
-
dragon_res2
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res2.ply'¶
-
dragon_res3
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res3.ply'¶
-
dragon_res4
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/dragon/dragon_vrip_res4.ply'¶
-
gpustatus
= '/tmp/gpu/{machine_name}'¶
-
lenna
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/images/lenna.png'¶
-
lpips_weights
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/lpips/net-lin_alex_v0.1.pb'¶
-
open_sans_regular
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/fonts/open-sans/OpenSans-Regular.ttf'¶
-
teapot
= '/home/docs/checkouts/readthedocs.org/user_builds/xiuminglib/checkouts/latest/data/models/teapot.obj'¶
-
xiuminglib.decor module¶
Decorators that wrap a function.
If the function is defined in the file where you want to use the decorator, you can decorate the function at define time:
@decorator
def somefunc():
    return
If the function is defined somewhere else, do:
from numpy import mean
mean = decorator(mean)
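Both usage patterns above can be tried with a toy decorator. The sketch below is only illustrative: decorator here is a hypothetical call-counting wrapper, not one of the decorators in this module, and statistics.mean stands in for a function defined elsewhere.

```python
import functools
from statistics import mean


def decorator(func):
    """Hypothetical decorator that counts how often the wrapped function is called."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.n_calls += 1
        return func(*args, **kwargs)
    wrapper.n_calls = 0
    return wrapper


# Case 1: decorate at define time
@decorator
def somefunc():
    return "done"


# Case 2: wrap a function that is defined somewhere else
mean = decorator(mean)
```

After this, somefunc() and mean([1, 2, 3]) behave as before, while somefunc.n_calls and mean.n_calls track the number of calls.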
-
xiuminglib.decor.
colossus_interface
(somefunc)[source]¶ Wraps black-box functions to read from and write to Google Colossus.
Because it’s hard (if possible at all) to figure out which path is input, and which is output, when the input function is black-box, this is a “best-effort” decorator (see below for warnings).
This decorator works by looping through all the positional and keyword parameters, copying CNS paths that exist prior to somefunc execution to temporary local locations, running somefunc and writing its output to local locations, and finally copying local paths that get modified by somefunc to their corresponding CNS locations.
Warning
Therefore, if somefunc’s output already exists (e.g., you are re-running the function to overwrite the old result), it will be copied to local, overwritten by somefunc locally, and finally copied back to CNS. This doesn’t lead to wrong behaviors, but is inefficient.
This decorator doesn’t depend on Blaze, as it’s using the fileutil CLI, rather than google3.pyglib.gfile. This is convenient in at least two cases:
- You are too lazy to use Blaze, want to run tests quickly on your local machine, but need access to CNS files.
- Your IO is more complex than what with gfile.Open(...) as h: can do (e.g., a Blender function importing an object from a path), in which case you have to copy the CNS file to local (“local” here could also mean a Borglet’s local).
This interface generally works with resolved paths (e.g., /path/to/file), but not with wildcard paths (e.g., /path/to/???), since it’s hard (if possible at all) to guess what your function tries to do with such wildcard paths.
Writes
- Input files copied from Colossus to $TMP/.
- Output files generated to $TMP/, to be copied to Colossus.
xiuminglib.img module¶
-
xiuminglib.img.
alpha_blend
(arr1, alpha, arr2=None)[source]¶ Alpha-blends two arrays, or masks one array.
Parameters: - arr1 (numpy.ndarray) – Input array.
- alpha (numpy.ndarray) – Alpha map whose values are \(\in [0,1]\).
- arr2 (numpy.ndarray) – Input array. If None, arr1 will be blended with an all-zero array, equivalent to masking arr1.
Returns: Blended array of type float.
Return type:
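The blending rule can be sketched on flat lists of floats. This assumes the usual convention that alpha weights arr1 and (1 - alpha) weights arr2; alpha_blend_flat is a hypothetical stand-in that works on plain Python lists rather than NumPy arrays.

```python
def alpha_blend_flat(arr1, alpha, arr2=None):
    """Blends two equally long lists of floats element by element.

    Hypothetical sketch; assumes alpha weights arr1 and (1 - alpha) weights arr2.
    """
    if arr2 is None:
        # Blending with zeros is the same as masking arr1 with alpha
        arr2 = [0.0] * len(arr1)
    return [a * x1 + (1 - a) * x2 for x1, a, x2 in zip(arr1, alpha, arr2)]


print(alpha_blend_flat([2.0, 2.0], [0.5, 1.0], [4.0, 4.0]))
```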
-
xiuminglib.img.
binarize
(im, threshold=None)[source]¶ Binarizes images.
Parameters: - im (numpy.ndarray) – Image to binarize. Of any integer type (uint8, uint16, etc.). If H-by-W-by-3, will be converted to grayscale and treated as H-by-W.
- threshold (float, optional) – Threshold for binarization. None means midpoint of the dtype.
Returns: Binarized image. Of only 0’s and 1’s.
Return type:
-
xiuminglib.img.
compute_gradients
(im)[source]¶ Computes magnitudes and orientations of image gradients.
With Scharr operators:
[ 3   0   -3 ]           [ 3   10   3]
[10   0  -10 ]    and    [ 0    0   0]
[ 3   0   -3 ]           [-3  -10  -3]
Parameters: im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C if multi-channel (e.g., RGB) images. Gradients are computed independently for each of the C channels.
Returns: - grad_mag (numpy.ndarray) – Magnitude image of the gradients.
- grad_orient (numpy.ndarray) – Orientation image of the gradients (in radians).
           y
           ^ pi/2
           |
  pi       |
   --------+--------> 0
  -pi      |        x
           | -pi/2
Return type: tuple
-
xiuminglib.img.
denormalize_float
(arr, uint_type='uint8')[source]¶ De-normalizes the input float array such that \(1\) becomes the target uint maximum.
Parameters: - arr (numpy.ndarray) – Input array of type float.
- uint_type (str, optional) – Target uint type.
Returns: De-normalized array of the target type.
Return type:
-
xiuminglib.img.
find_local_extrema
(im, want_maxima, kernel_size=3)[source]¶ Finds local maxima or minima in an image.
Parameters: - im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C for multi-channel (e.g., RGB) images. Extrema are found independently for each of the C channels.
- want_maxima (bool) – Whether maxima or minima are wanted.
- kernel_size (int, optional) – Side length of the square window under consideration. Must be larger than 1.
Returns: Binary map indicating if each pixel is a local extremum.
Return type:
-
xiuminglib.img.
gamma_correct
(im, gamma=2.2)[source]¶ Applies gamma correction to a uint image.
Parameters: - im (numpy.ndarray) – H-by-W if single-channel (e.g., grayscale) or H-by-W-by-C multi-channel (e.g., RGB) uint images.
- gamma (float, optional) – Gamma value \(< 1\) shifts image towards the darker end of the spectrum, while value \(> 1\) towards the brighter.
Returns: Gamma-corrected image.
Return type:
-
xiuminglib.img.
grid_query_img
(im, query_x, query_y, method='bilinear')[source]¶ Grid queries an image via interpolation.
If you want to grid query unstructured data, consider grid_query_unstruct().
This function uses either bilinear interpolation that allows you to break big matrices into patches and work locally, or bivariate spline interpolation that fits a global spline (so memory-intensive) and shows global effects.
Parameters: - im (numpy.ndarray) – H-by-W or H-by-W-by-C rectangular grid of data. Each of C channels is interpolated independently.
- query_x (array_like) – \(x\) coordinates of the queried rectangle, e.g., np.arange(10) for a 10-by-10 grid (hence, this should not be generated by numpy.meshgrid() or similar functions).
- query_y (array_like) – \(y\) coordinates, following this convention:
+---------> query_x
|
|
|
v query_y
- method (str, optional) – Interpolation method: 'spline' or 'bilinear'.
Returns: Interpolated values at query locations, of shape (len(query_y), len(query_x)) for single-channel input or (len(query_y), len(query_x), im.shape[2]) for multi-channel input.
Return type:
-
xiuminglib.img.
grid_query_unstruct
(uvs, values, grid_res, method=None)[source]¶ Grid queries unstructured data given by coordinates and their values.
If you are looking to grid query structured data, such as an image, check out grid_query_img().
This function interpolates values on a rectangular grid given some sparse, unstructured samples. One use case is where you have some UV locations and their associated colors, and you want to “paint the colors” on a UV canvas.
Parameters: - uvs (numpy.ndarray) – N-by-2 array of UV coordinates where we have values (e.g., colors). See xiuminglib.blender.object.smart_uv_unwrap() for the UV coordinate convention.
- values (numpy.ndarray) – N-by-M array of M-D values at the N UV locations, or N-array of scalar values at the N UV locations. Channels are interpolated independently.
- grid_res (array_like) – Resolution (height first; then width) of the query grid.
- method (dict, optional) –
Dictionary of method-specific parameters. Implemented methods and their default parameters:
# Default
method = {
    'func': 'griddata',
    # Which SciPy function to call.
    'func_underlying': 'linear',
    # Fed to `griddata` as the `method` parameter.
    'fill_value': (0,), # black
    # Will be used to fill in pixels outside the convex hulls
    # formed by the UV locations, and if `max_l1_interp` is
    # provided, also the pixels whose interpolation is too much
    # of a stretch to be trusted. In the context of "canvas
    # painting," this will be the canvas' base color.
    'max_l1_interp': np.inf, # trust/accept all interpolations
    # Maximum L1 distance, which we can trust in interpolation,
    # to pixels that have values. Interpolation across a longer
    # range will not be trusted, and hence will be filled with
    # `fill_value`.
}
method = {
    'func': 'rbf',
    # Which SciPy function to call.
    'func_underlying': 'linear',
    # Fed to `Rbf` as the `method` parameter.
    'smooth': 0, # no smoothing
    # Fed to `Rbf` as the `smooth` parameter.
}
Returns: Interpolated values at query locations, of shape grid_res for single-channel input or (grid_res[0], grid_res[1], values.shape[1]) for multi-channel input.
Return type:
-
xiuminglib.img.
linear2srgb
(im, clip=False)[source]¶ Converts an image from linear RGB values to sRGB.
Parameters: - im (numpy.ndarray) – Of type float, and all pixels must be \(\in [0, 1]\).
- clip (bool, optional) – Whether to clip values to \([0,1]\). Defaults to False.
Returns: Converted image in sRGB.
Return type:
-
xiuminglib.img.
normalize_uint
(arr)[source]¶ Normalizes the input uint array such that its dtype maximum becomes \(1\).
Parameters: arr (numpy.ndarray) – Input array of type uint.
Returns: Normalized array of type float.
Return type: numpy.ndarray
-
xiuminglib.img.
remove_islands
(im, min_n_pixels, connectivity=4)[source]¶ Removes small islands of pixels from a binary image.
Parameters: - im (numpy.ndarray) – Input binary image. Of only 0’s and 1’s.
- min_n_pixels (int) – Minimum island size to keep.
- connectivity (int, optional) – Definition of “connected”: either 4 or 8.
Returns: Output image with small islands removed.
Return type:
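Island removal amounts to labeling connected components and dropping the small ones. The sketch below does this with a breadth-first search over a list-of-lists binary image; remove_islands_sketch is a hypothetical name, and the library may well rely on a different labeling routine.

```python
from collections import deque


def remove_islands_sketch(im, min_n_pixels, connectivity=4):
    """Zeroes out connected groups of 1's that have fewer than min_n_pixels pixels."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                   if (di, dj) != (0, 0)]
    else:
        raise ValueError("Connectivity must be 4 or 8")
    h, w = len(im), len(im[0])
    out = [row[:] for row in im]  # do not modify the input
    visited = [[False] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if im[i][j] != 1 or visited[i][j]:
                continue
            # Breadth-first search collects one island
            island, queue = [], deque([(i, j)])
            visited[i][j] = True
            while queue:
                ci, cj = queue.popleft()
                island.append((ci, cj))
                for di, dj in offsets:
                    ni, nj = ci + di, cj + dj
                    if (0 <= ni < h and 0 <= nj < w
                            and im[ni][nj] == 1 and not visited[ni][nj]):
                        visited[ni][nj] = True
                        queue.append((ni, nj))
            if len(island) < min_n_pixels:
                for ci, cj in island:
                    out[ci][cj] = 0
    return out


im = [[1, 1, 0, 0],
      [1, 0, 0, 1],
      [0, 0, 0, 0]]
print(remove_islands_sketch(im, min_n_pixels=2))
```

The choice of connectivity matters for diagonal neighbors: two pixels touching only at a corner form one island under 8-connectivity but two separate islands under 4-connectivity.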
-
xiuminglib.img.
resize
(arr, new_h=None, new_w=None, method='cv2')[source]¶ Resizes an image, with the option of maintaining the aspect ratio.
Parameters: - arr (numpy.ndarray) – Image to resize. If multiple-channel, each channel is resized independently.
- new_h (int, optional) – Target height. If None, will be calculated according to the target width, assuming the same aspect ratio.
- new_w (int, optional) – Target width. If None, will be calculated according to the target height, assuming the same aspect ratio.
- method (str, optional) – Accepted values: 'cv2', 'tf', and 'pil'.
Returns: Resized image.
Return type:
-
xiuminglib.img.
rgb2lum
(im)[source]¶ Converts RGB to relative luminance (if input is linear RGB) or luma (if input is gamma-corrected RGB).
Parameters: im (numpy.ndarray) – RGB array of shape (..., 3).
Returns: Relative luminance or luma array. Return type: numpy.ndarray
-
xiuminglib.img.
srgb2linear
(im, clip=False)[source]¶ Converts an image from sRGB values to linear RGB.
Parameters: - im (numpy.ndarray) – Of type float, and all pixels must be \(\in [0, 1]\).
- clip (bool, optional) – Whether to clip values to \([0,1]\). Defaults to False.
Returns: Converted image in linear RGB.
Return type:
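For reference, the standard piecewise sRGB transfer function and its inverse can be sketched per value as below. The constants are those of the sRGB standard; whether linear2srgb() and srgb2linear() use exactly this form is an assumption, and the function names here are hypothetical.

```python
def linear_to_srgb(x):
    """Encodes one linear value in [0, 1] with the standard sRGB transfer function."""
    if x <= 0.0031308:
        return 12.92 * x  # linear segment near black
    return 1.055 * x ** (1 / 2.4) - 0.055


def srgb_to_linear(s):
    """Decodes one sRGB-encoded value in [0, 1] back to linear."""
    if s <= 0.04045:
        return s / 12.92
    return ((s + 0.055) / 1.055) ** 2.4


# Mid-gray in linear space is noticeably brighter once sRGB-encoded
print(linear_to_srgb(0.5))
```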
-
xiuminglib.img.
tonemap
(hdr, method='gamma', gamma=2.2)[source]¶ Tonemaps an HDR image.
Parameters: - hdr (numpy.ndarray) – HDR image.
- method (str, optional) – Values accepted: 'gamma' and 'reinhard'.
- gamma (float, optional) – Gamma value used if method is 'gamma'.
Returns: Tonemapped image \(\in [0, 1]\).
Return type:
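The two tonemapping methods can be sketched per value as follows. Both forms are assumptions made for illustration: the gamma branch clips to \([0, 1]\) before raising to \(1/\gamma\), and the Reinhard branch uses the simple global operator \(x / (1 + x)\); the library's exact formulas may differ. The name tonemap_value is hypothetical.

```python
def tonemap_value(x, method="gamma", gamma=2.2):
    """Tonemaps one non-negative HDR value into [0, 1]."""
    if method == "gamma":
        # Assumed form: clip, then apply the inverse-gamma curve
        return min(max(x, 0.0), 1.0) ** (1 / gamma)
    if method == "reinhard":
        # Assumed form: simple global Reinhard operator
        return x / (1 + x)
    raise ValueError(method)


print(tonemap_value(3.0, method="reinhard"))
```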
xiuminglib.imprt module¶
xiuminglib.interact module¶
-
xiuminglib.interact.
ask_to_proceed
(msg, level='warning')[source]¶ Pauses there to ask the user whether to proceed.
Parameters:
-
xiuminglib.interact.
print_attrs
(obj, excerpts=None, excerpt_win_size=60, max_recursion_depth=None)[source]¶ Prints all attributes, recursively, of an object.
Parameters: - obj (object) – Object in which we search for the attribute.
- excerpts (str or list(str), optional) – Print only excerpts containing certain attributes. None means to print all.
- excerpt_win_size (int, optional) – How many characters get printed around a match.
- max_recursion_depth (int, optional) – Maximum recursion depth. None means no limit.
xiuminglib.linalg module¶
-
xiuminglib.linalg.
angle_between
(vec1, vec2, radian=True)[source]¶ Computes the angle between two vectors.
Parameters: - vec1 (array_like) – Vector 1.
- vec2 (array_like) – Vector 2.
- radian (bool, optional) – Whether to use radians.
Returns: The angle \(\in [0,\pi]\).
Return type:
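The angle follows from the normalized dot product. The sketch below is a plain-Python illustration with a hypothetical name (angle_between_sketch); clamping the cosine guards against round-off pushing it slightly outside \([-1, 1]\).

```python
import math


def angle_between_sketch(vec1, vec2, radian=True):
    """Computes the angle between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    # Clamp to the valid domain of acos to absorb floating-point error
    cos = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    angle = math.acos(cos)
    return angle if radian else math.degrees(angle)


print(angle_between_sketch((1, 0, 0), (0, 1, 0), radian=False))
```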
-
xiuminglib.linalg.
calc_refl_vec
(h, l)[source]¶ Calculates the reflection vector given the half vector.
Parameters: - h (array_like) – Half vector as a 3-array.
- l (array_like) – “Incident” vector (pointing outwards from the surface point), as a 3-array.
Returns: Reflection vector as a 3-array.
Return type:
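Since the half vector bisects the incident and reflection vectors, the reflection vector can be obtained by mirroring the incident vector about the half vector: \(r = 2(h\cdot l)h - l\) for unit \(h\). The sketch below assumes this formula; calc_refl_vec_sketch is a hypothetical name.

```python
import math


def calc_refl_vec_sketch(h, l):
    """Mirrors the incident vector l about the half vector h."""
    norm = math.sqrt(sum(c * c for c in h))
    h = [c / norm for c in h]  # the formula needs a unit half vector
    h_dot_l = sum(a * b for a, b in zip(h, l))
    return [2 * h_dot_l * a - b for a, b in zip(h, l)]


s = 1 / math.sqrt(2)
print(calc_refl_vec_sketch((0.0, 0.0, 1.0), (s, 0.0, s)))
```

The result has the same length as the incident vector and makes the same angle with the half vector, on the opposite side.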
-
xiuminglib.linalg.
is_identity
(mat, eps=None)[source]¶ Checks if a matrix is an identity matrix.
If the input is not even square, False is returned.
Parameters: - mat (numpy.ndarray) – Input matrix.
- eps (float, optional) – Numerical tolerance for equality. None means np.finfo(mat.dtype).eps.
Returns: Whether the input is an identity matrix.
Return type:
-
xiuminglib.linalg.
is_symmetric
(mat, eps=None)[source]¶ Checks if a matrix is symmetric.
If the input is not even square, False is returned.
Parameters: - mat (numpy.ndarray) – Input matrix.
- eps (float, optional) – Numerical tolerance for equality. None means np.finfo(mat.dtype).eps.
Returns: Whether the input is symmetric.
Return type:
-
xiuminglib.linalg.
normalize
(vecs, axis=0)[source]¶ Normalizes vectors.
Parameters: - vecs (array_like) – 1D array for a single vector, 2D array for multiple vectors, 3D array for an “image” of vectors, etc.
- axis (int, optional) – Along which axis normalization is done.
Returns: Normalized vector(s) of the same shape as input.
Return type:
-
xiuminglib.linalg.
project_onto
(pts, basis)[source]¶ Projects points onto a basis vector.
Parameters: - pts (array_like) – 1D array for one vector; 2D N-by-M array for N M-D points.
- basis (array_like) – 1D M-array specifying which basis vector to project to.
Returns: Projected point(s) of the same shape.
Return type:
xiuminglib.log module¶
-
xiuminglib.log.
get_logger
(level=None)[source]¶ Creates a logger for functions in the library.
Parameters: level (str, optional) – Logging level. Defaults to logging.INFO.
Returns: Logger created. Return type: logging.Logger
xiuminglib.metric module¶
-
class
xiuminglib.metric.
Base
(dtype)[source]¶ Bases:
object
The base metric.
-
dtype
¶ Data type, with which data dynamic range is derived.
Type: numpy.dtype
-
__call__
(im1, im2, **kwargs)[source]¶ Parameters: - im1 (numpy.ndarray) – An image of shape H-by-W, H-by-W-by-1, or H-by-W-by-3.
- im2 –
Returns: The metric computed.
Return type:
-
__init__
(dtype)[source]¶ Parameters: dtype (str or numpy.dtype) – Data type, from which dynamic range will be derived.
-
-
class
xiuminglib.metric.
LPIPS
(dtype, weight_pb=None)[source]¶ Bases:
xiuminglib.metric.Base
The Learned Perceptual Image Patch Similarity (LPIPS) metric (lower is better).
Project page: https://richzhang.github.io/PerceptualSimilarity/
Note
This implementation assumes the minimum value allowed is \(0\), so data dynamic range becomes the maximum value allowed.
-
dtype
¶ Data type, with which data dynamic range is derived.
Type: numpy.dtype
-
lpips_func
¶ The LPIPS network packed into a function.
Type: tf.function
-
__call__
(im1, im2)[source]¶ Parameters: - im1 –
- im2 –
Returns: LPIPS computed (lower is better).
Return type:
-
__init__
(dtype, weight_pb=None)[source]¶ Parameters: - dtype (str or numpy.dtype) – Data type, from which maximum allowed will be derived.
- weight_pb (str, optional) – Path to the network weight protobuf. Defaults to the bundled net-lin_alex_v0.1.pb.
-
-
class
xiuminglib.metric.
PSNR
(dtype)[source]¶ Bases:
xiuminglib.metric.Base
Peak Signal-to-Noise Ratio (PSNR) in dB (higher is better).
If the inputs are RGB, they are first converted to luma (or relative luminance, if the inputs are not gamma-corrected). PSNR is computed on the luma.
-
__call__
(im1, im2, mask=None)[source]¶ Parameters: - im1 –
- im2 –
- mask (numpy.ndarray, optional) – An H-by-W logical array indicating pixels that contribute to the computation.
Returns: PSNR in dB.
Return type:
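PSNR follows the usual definition \(10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})\), where MAX is the dynamic range derived from the data type. The sketch below works on flat lists of single-channel values and leaves out the luma conversion and the mask; psnr_sketch is a hypothetical name.

```python
import math


def psnr_sketch(im1, im2, max_val=255):
    """Computes PSNR in dB between two equally long lists of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(im1, im2)) / len(im1)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_val ** 2 / mse)


print(psnr_sketch([10, 20, 30, 40], [11, 21, 31, 41]))
```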
-
-
class
xiuminglib.metric.
SSIM
(dtype)[source]¶ Bases:
xiuminglib.metric.Base
The (multi-scale) Structural Similarity Index (SSIM) \(\in [0,1]\) (higher is better).
If the inputs are RGB, they are first converted to luma (or relative luminance, if the inputs are not gamma-corrected). SSIM is computed on the luma.
xiuminglib.os module¶
-
xiuminglib.os.
call
(cmd, cwd=None, wait=True, quiet=False)[source]¶ Executes a command in shell.
Parameters: - cmd (str) – Command to be executed.
- cwd (str, optional) – Directory to execute the command in. None means current directory.
- wait (bool, optional) – Whether to block until the call finishes.
- quiet (bool, optional) – Whether to print out the output stream (if any) and error stream (if error occurred).
Returns: - retcode (int) – Command exit code. 0 means a successful call. Always None if not waiting for the command to finish.
- stdout (str) – Standard output stream. Always None if not waiting.
- stderr (str) – Standard error stream. Always None if not waiting.
Return type:
-
xiuminglib.os.
cp
(src, dst, cns_parallel_copy=10)[source]¶ Copies files, possibly from/to the Google Colossus Filesystem.
Parameters:
-
xiuminglib.os.
exists_isdir
(path)[source]¶ Determines whether a path exists, and if so, whether it is a file or directory.
Supports Google Colossus (CNS) paths by using gfile (preferred for speed) or the fileutil CLI.
Parameters: path (str) – A path.
Returns: - exists (bool) – Whether the path exists.
- isdir (bool) – Whether the path is a file or directory. None if the path doesn’t exist.
Return type: tuple
-
xiuminglib.os.
make_exp_dir
(directory, param_dict, rm_if_exists=False)[source]¶ Makes an experiment output folder by hashing the experiment parameters.
Parameters:
Writes
- The experiment parameters in <directory>/<hash>/param.json.
Returns: The experiment output folder just made. Return type: str
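The idea of naming an output folder by a hash of the experiment parameters can be sketched as below. Everything specific here is an assumption made for illustration: the name make_exp_dir_sketch, the JSON serialization with sorted keys, and the choice of SHA-256 are not taken from the library, and Colossus support is left out.

```python
import hashlib
import json
import os
import tempfile


def make_exp_dir_sketch(directory, param_dict):
    """Creates <directory>/<hash>/ and writes the parameters to param.json there."""
    # Sort keys so that the same parameters always produce the same hash
    param_str = json.dumps(param_dict, sort_keys=True)
    hash_str = hashlib.sha256(param_str.encode("utf-8")).hexdigest()
    exp_dir = os.path.join(directory, hash_str)
    os.makedirs(exp_dir, exist_ok=True)
    with open(os.path.join(exp_dir, "param.json"), "w") as f:
        json.dump(param_dict, f, sort_keys=True, indent=4)
    return exp_dir


with tempfile.TemporaryDirectory() as root:
    print(os.path.basename(make_exp_dir_sketch(root, {"lr": 0.1, "batch_size": 4})))
```

With this scheme, re-running an experiment with identical parameters lands in the same folder, while any parameter change produces a new one.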
-
xiuminglib.os.
makedirs
(directory, rm_if_exists=False)[source]¶ Wraps os.makedirs() to support removing the directory if it already exists.
Google Colossus-compatible: it tries to use gfile first for speed. This will fail if Blaze is not used, in which case it then falls back to using fileutil CLI as external process calls.
Parameters:
-
xiuminglib.os.
open_file
(path, mode)[source]¶ Opens a file.
Supports Google Colossus if gfile can be imported.
Parameters:
Returns: File handle that can be used as a context.
-
xiuminglib.os.
rm
(path)[source]¶ Removes a file or recursively a directory, with Google Colossus compatibility.
Parameters: path (str) –
-
xiuminglib.os.
sortglob
(directory, filename='*', ext=None, ext_ignore_case=False)[source]¶ Globs and then sorts filenames, possibly ending with multiple extensions, in a directory.
Supports Google Colossus, by using gfile (preferred for speed) or the fileutil CLI when Blaze is not used (hence, gfile unavailable).
Parameters: - directory (str) – Directory to glob, e.g., '/path/to/'.
- filename (str or tuple(str), optional) – Filename pattern excluding extensions, e.g., 'img*'.
- ext (str or tuple(str), optional) – Extensions of interest, e.g., ('png', 'jpg'). None means no extension, useful for folders or files with no extension.
- ext_ignore_case (bool, optional) – Whether to ignore case for extensions.
Returns: Sorted list of files globbed.
Return type:
xiuminglib.sig module¶
-
xiuminglib.sig.
dct_1d_bases
(n)[source]¶ Generates 1D discrete cosine transform (DCT) bases.
Bases are rows of \(Y\), which is orthogonal: \(Y^TY=YY^T=I\). The forward process (analysis) is \(X=Yx\), and the inverse (synthesis) is \(x=Y^{-1}X=Y^TX\). See main() for example usages and how this produces the same results as scipy.fftpack.dct() (with type=2 and norm='ortho').
Parameters: n (int) – Signal length.
Returns: Matrix whose \(i\)-th row, when dotted with signal (column) vector, gives the coefficient for the \(i\)-th DCT component. Of shape (n, n).
Return type: numpy.ndarray
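An orthonormal DCT-II basis matrix, matching the type=2 and norm='ortho' convention mentioned above, can be sketched in plain Python as follows. The name dct_1d_bases_sketch is hypothetical, and a list of lists stands in for the NumPy array that the library returns.

```python
import math


def dct_1d_bases_sketch(n):
    """Builds the n-by-n orthonormal DCT-II matrix as a list of rows."""
    mat = []
    for k in range(n):
        # The DC row gets a smaller scale so that all rows have unit norm
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        mat.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n)])
    return mat


# Analysis X = Yx of a constant signal: only the DC coefficient is non-zero
y = dct_1d_bases_sketch(4)
x = [1.0, 1.0, 1.0, 1.0]
print([sum(row[i] * x[i] for i in range(4)) for row in y])
```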
-
xiuminglib.sig.
dct_2d_bases
(h, w)[source]¶ Generates bases for 2D discrete cosine transform (DCT).
Bases are given in two matrices \(Y_h\) and \(Y_w\). See dct_1d_bases() for their properties. Note that \(Y_w\) has already been transposed (hence, \(Y_hxY_w\) instead of \(Y_hxY_w^T\) below).
Input image \(x\) should be transformed by both matrices (i.e., along both dimensions). Specifically, the analysis process is \(X=Y_hxY_w\), and the synthesis process is \(x=Y_h^TXY_w^T\). See main() for example usages.
Parameters: - h (int) – Image height.
- w (int) – Image width.
Returns: - dct_mat_h (numpy.ndarray) – DCT matrix \(Y_h\) transforming rows of the 2D signal. Of shape (h, h).
- dct_mat_w (numpy.ndarray) – \(Y_w\) transforming columns. Of shape (w, w).
Return type:
-
xiuminglib.sig.
dct_2d_bases_vec
(h, w)[source]¶ Generates bases stored in a single matrix, along whose height 2D frequencies get raveled.
Using the “vectorization + Kronecker product” trick: \(\operatorname{vec}(Y_hxY_w)=\left(Y_w^T\otimes Y_h\right) \operatorname{vec}(x)\). So unlike dct_2d_bases(), this function generates a single matrix \(Y=Y_w^T\otimes Y_h\), whose \(k\)-th row is the flattened \((i, j)\)-th basis, where \(k=wi+j\).
Input image \(x\) can be transformed with a single matrix multiplication. Specifically, the analysis process is \(X=Y \operatorname{vec}(x)\), and the synthesis process is \(x= \operatorname{unvec}(Y^TX)\). See main() for examples.
Warning
If you want to reconstruct the signal with only some (i.e., not all) bases, do not slice those rows out from \(Y\) and use only their coefficients. Instead, you should use the full \(Y\) matrix and set to zero the coefficients for the unused frequency components. See main() for examples.
Parameters: - h (int) – Image height.
- w (int) – Image width.
Returns: Matrix with flattened bases as rows. The \(k\)-th row, when numpy.reshape()’ed into (h, w), is the \((i, j)\)-th frequency component, where \(k=wi+j\). Of shape (h * w, h * w).
Return type:
-
xiuminglib.sig.
dft_1d_bases
(n)[source]¶ Generates 1D discrete Fourier transform (DFT) bases.
Bases are rows of \(Y\), which is unitary (\(Y^HY=YY^H=I\), where \(Y^H\) is the conjugate transpose) and symmetric. The forward process (analysis) is \(X=Yx\), and the inverse (synthesis) is \(x=Y^{-1}X=Y^HX\). See main() for example usages.
Parameters: n (int) – Signal length.
Returns: Matrix whose \(i\)-th row, when dotted with signal (column) vector, gives the coefficient for the \(i\)-th Fourier component. Of shape (n, n).
Return type: numpy.ndarray
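A unitary DFT matrix with the properties stated above can be sketched in plain Python. The \(1/\sqrt{n}\) scaling is what makes the matrix unitary; dft_1d_bases_sketch is a hypothetical name, and a list of lists of complex numbers stands in for the NumPy array.

```python
import cmath
import math


def dft_1d_bases_sketch(n):
    """Builds the n-by-n unitary DFT matrix as a list of rows of complex numbers."""
    return [[cmath.exp(-2j * math.pi * k * i / n) / math.sqrt(n)
             for i in range(n)]
            for k in range(n)]


# Analysis X = Yx of a constant signal: only the zero-frequency term is non-zero
y = dft_1d_bases_sketch(4)
x = [1.0, 1.0, 1.0, 1.0]
print([abs(sum(row[i] * x[i] for i in range(4))) for row in y])
```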
-
xiuminglib.sig.
dft_2d_bases
(h, w)[source]¶ Generates bases for 2D discrete Fourier transform (DFT).
Bases are given in two matrices \(Y_h\) and \(Y_w\). See
dft_1d_bases()
for their properties. Note that \(Y_w\) has already been transposed.Input image \(x\) should be transformed by both matrices (i.e., along both dimensions). Specifically, the analysis process is \(X=Y_hxY_w\), and the synthesis process is \(x=Y_h^HXY_w^H\). See
main()
for example usages and how this produces the same results asnumpy.fft.fft2()
(withnorm='ortho'
).See also
From
numpy.fft
–A[1:n/2]
contains the positive-frequency terms, andA[n/2+1:]
contains the negative-frequency terms, in order of decreasingly negative frequency. For an even number of input points,A[n/2]
represents both positive and negative Nyquist frequency, and is also purely real for real input. For an odd number of input points,A[(n-1)/2]
contains the largest positive frequency, whileA[(n+1)/2]
contains the largest negative frequency.Parameters: - h (int) – Image height.
- w –
Returns: - dft_mat_h (numpy.ndarray) – DFT matrix \(Y_h\) transforming rows of the 2D signal. Of shape (h, h).
- dft_mat_w (numpy.ndarray) – \(Y_w\) transforming columns. Of shape (w, w).
Return type:
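The two-sided analysis and synthesis formulas can be illustrated with plain NumPy. This is a sketch, not the library's implementation: the hypothetical helper `dft_matrix` stands in for the returned matrices, assuming they are the unitary, symmetric DFT matrices described in dft_1d_bases().

```python
import numpy as np


def dft_matrix(n):
    """Unitary, symmetric DFT matrix of shape (n, n) (illustrative helper)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)


h, w = 4, 6
y_h = dft_matrix(h)
y_w = dft_matrix(w)  # symmetric, so transposing it changes nothing

rng = np.random.default_rng(0)
img = rng.random((h, w))

coeffs = y_h @ img @ y_w  # analysis: X = Y_h x Y_w
img_recon = y_h.conj().T @ coeffs @ y_w.conj().T  # synthesis

matches_fft2 = np.allclose(coeffs, np.fft.fft2(img, norm='ortho'))
```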
-
xiuminglib.sig.dft_2d_bases_vec(h, w)
Generates bases stored in a single matrix, along whose height 2D frequencies get raveled.
Using the “vectorization + Kronecker product” trick: \(\operatorname{vec}(Y_hxY_w)=\left(Y_w^T\otimes Y_h\right) \operatorname{vec}(x)\). So unlike dft_2d_bases(), this function generates a single matrix \(Y=Y_w^T\otimes Y_h\), whose \(k\)-th row is the flattened \((i, j)\)-th basis, where \(k=wi+j\).
Input image \(x\) can be transformed with a single matrix multiplication. Specifically, the analysis process is \(X=Y\operatorname{vec}(x)\), and the synthesis process is \(x=\operatorname{unvec}(Y^HX)\). See main() for examples.
Parameters: - h (int) – Image height.
- w (int) – Image width.
Returns: Complex matrix with flattened bases as rows. The \(k\)-th row, when numpy.reshape()’ed into (h, w), is the \((i, j)\)-th frequency component, where \(k=wi+j\). Of shape (h * w, h * w).
Return type:
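The vectorization trick can be verified numerically. The sketch below is written under assumptions: `dft_matrix` is a hypothetical helper rather than this function, and the flattening order is chosen explicitly. The identity \(\operatorname{vec}(Y_hxY_w)=(Y_w^T\otimes Y_h)\operatorname{vec}(x)\) holds for column-major vectorization, whereas with NumPy's default row-major ravel (which gives \(k=wi+j\)) the equivalent matrix is \(Y_h\otimes Y_w^T\). Which ordering the library uses internally is not confirmed here.

```python
import numpy as np


def dft_matrix(n):
    """Unitary, symmetric DFT matrix of shape (n, n) (illustrative helper)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)


h, w = 4, 6
y_h, y_w = dft_matrix(h), dft_matrix(w)
rng = np.random.default_rng(0)
img = rng.random((h, w))
coeffs_2d = y_h @ img @ y_w  # reference result from the two-matrix form

# Column-major vec: vec(Y_h x Y_w) = (Y_w^T kron Y_h) vec(x)
y_col = np.kron(y_w.T, y_h)
coeffs_col = y_col @ img.ravel(order='F')

# Row-major ravel: row k = w * i + j holds the flattened (i, j)-th basis
y_row = np.kron(y_h, y_w.T)
coeffs_row = y_row @ img.ravel()

# Synthesis with a single matrix multiplication, then unvec
img_recon = (y_row.conj().T @ coeffs_row).reshape(h, w)
```

Both orderings reproduce the two-matrix result; they differ only in how the \(hw\) coefficients are laid out in the output vector.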
-
xiuminglib.sig.dft_2d_freq(h, w)
Gets 2D discrete Fourier transform (DFT) sample frequencies.
Parameters: - h (int) – Image height.
- w (int) – Image width.
Returns: - freq_h (numpy.ndarray) – Sample frequencies, in cycles per pixel, along the height dimension. E.g., if freq_h[i, j] == 0.5, then the (i, j)-th component repeats every 2 pixels along the height dimension.
- freq_w (numpy.ndarray) – Sample frequencies, in cycles per pixel, along the width dimension.
Return type:
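One way to obtain such frequency grids is with `numpy.fft.fftfreq()`. This is a sketch that assumes both outputs have shape (h, w), as the `freq_h[i, j]` indexing above suggests; it is not taken from the library's source.

```python
import numpy as np

h, w = 4, 6
# np.fft.fftfreq(n) returns frequencies in cycles per sample (here, per
# pixel), using the same ordering as numpy.fft's output coefficients
freq_h, freq_w = np.meshgrid(
    np.fft.fftfreq(h), np.fft.fftfreq(w), indexing='ij')

# Period, in pixels, of the (1, 0)-th component along the height dimension
period_h = 1 / abs(freq_h[1, 0])
```

Note that NumPy reports the Nyquist frequency of an even-length axis as -0.5, so comparing against 0.5 may need an absolute value.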
-
xiuminglib.sig.get_extrema(arr, top=True, n=1, n_std=None)
Gets top (or bottom) N value(s) from an M-D array, with the option to ignore outliers.
Parameters:
Returns: - ind (tuple) – Indices that give the extrema, M-tuple of arrays of N integers.
- val (numpy.ndarray) – Extremum values, i.e., arr[ind].
Return type:
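The shape of the returned indices (an M-tuple of length-N arrays) can be illustrated with a small NumPy sketch. The helper `top_n` below is hypothetical and deliberately omits the outlier handling controlled by `n_std`, whose exact semantics are not documented here.

```python
import numpy as np


def top_n(arr, n=1, top=True):
    """Hypothetical helper: indices and values of the n extreme entries."""
    arr = np.asarray(arr)
    order = np.argsort(arr, axis=None)  # ascending order, flattened indices
    flat_ind = order[::-1][:n] if top else order[:n]
    ind = np.unravel_index(flat_ind, arr.shape)  # M-tuple of length-n arrays
    return ind, arr[ind]


arr = np.array([[3., 9., 1.],
                [7., 2., 8.]])
ind, val = top_n(arr, n=2)  # two largest values
ind_bottom, val_bottom = top_n(arr, n=2, top=False)  # two smallest values
```

Because `ind` is a tuple of index arrays, `arr[ind]` returns the extremum values directly.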
-
xiuminglib.sig.pca(data_mat, n_pcs=None, eig_method='scipy.sparse.linalg.eigsh')
Performs principal component (PC) analysis on data.
Works via eigendecomposition of the covariance matrix. See main() for example usages, including reconstructing data with top K PCs.
Parameters: - data_mat (array_like) – Data matrix of N data points in the M-D space, of shape M-by-N, where each column is a point.
- n_pcs (int, optional) – Number of top PCs requested. None means \(M-1\).
- eig_method (str, optional) – Method for eigendecomposition of the symmetric covariance matrix: 'numpy.linalg.eigh' or 'scipy.sparse.linalg.eigsh'.
Returns: - pcvars (numpy.ndarray) – PC variances (eigenvalues of covariance matrix) in descending order.
- pcs (numpy.ndarray) – Corresponding PCs (normalized eigenvectors), of shape M-by-n_pcs. Each column is a PC.
- projs (numpy.ndarray) – Data points centered and then projected to the n_pcs-D PC space. Of shape n_pcs-by-N. Each column is a point.
- data_mean (numpy.ndarray) – Mean that can be used to recover raw data. Of length M.
Return type:
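The pipeline described above (center, eigendecompose the covariance, project, reconstruct) can be sketched with `numpy.linalg.eigh()`. This is a minimal illustration, not the library's code; in particular, dividing by \(N-1\) when forming the covariance is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, n_pcs = 5, 200, 2
# M-by-N data matrix whose rows have increasingly large variances
data_mat = rng.standard_normal((m, n)) * np.arange(1, m + 1)[:, None]

data_mean = data_mat.mean(axis=1)  # length M
centered = data_mat - data_mean[:, None]
cov = centered @ centered.T / (n - 1)  # M-by-M, symmetric

eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:n_pcs]  # indices of the top n_pcs
pcvars = eigvals[order]  # descending PC variances
pcs = eigvecs[:, order]  # M-by-n_pcs, each column is a PC
projs = pcs.T @ centered  # n_pcs-by-N, each column is a point

# Approximate reconstruction of the raw data from the top n_pcs PCs
recon = pcs @ projs + data_mean[:, None]
```

The reconstruction is exact only when all M PCs are kept; with fewer PCs, it is the best rank-limited approximation of the centered data in the least-squares sense.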
-
xiuminglib.sig.sh_bases_real(l, n_lat, coord_convention='colatitude-azimuth', _check_orthonormality=False)
Generates real spherical harmonics (SHs).
See main() for example usages, including how to do both analysis and synthesis with the SHs.
Not accurate when n_lat is too small. E.g., orthonormality no longer holds when discretization is too coarse (small n_lat), as numerical integration fails to approximate the continuous integration.
Parameters: - l (int) – Up to which band (starting from 0). The number of harmonics is \((l+1)^2\). In other words, all harmonics within each band (\(-l\leq m\leq l\)) are used.
- n_lat (int) – Number of discretization levels of colatitude (for colatitude-azimuth convention; \([0, \pi]\)) or latitude (for latitude-longitude convention; \([-\frac{\pi}{2}, \frac{\pi}{2}]\)). With the same step size, n_azimuth will be twice as big, since azimuth (in colatitude-azimuth convention; \([0, 2\pi]\)) or longitude (in latitude-longitude convention; \([-\pi, \pi]\)) spans \(2\pi\).
- coord_convention (str, optional) –
Coordinate system convention to use: 'colatitude-azimuth' or 'latitude-longitude'. Colatitude-azimuth vs. latitude-longitude convention:

3D
                                   ^ z (colat = 0; lat = pi/2)
                                   |
          (azi = 3pi/2;            |
           lng = -pi/2)   ---------+---------> y (azi = pi/2;
                                 ,'|              lng = pi/2)
                               ,'  |
  (colat = pi/2, azi = 0;     x    | (colat = pi; lat = -pi/2)
   lat = 0, lng = 0)

2D
     (0, 0)                               (pi/2, 0)
        +----------->  (0, 2pi)               ^ lat
        |            azi                      |
        |                                     |
        |                     (0, -pi) -------+-------> (0, pi)
        v colat                               |        lng
     (pi, 0)                                  |
                                         (-pi/2, 0)

- _check_orthonormality (bool, optional, internal) – Whether to check orthonormality.
Returns: - ymat (numpy.ndarray) – Matrix whose rows are spherical harmonics as generated by scipy.special.sph_harm(). When dotted with the flattened image (column) vector weighted by areas_on_unit_sphere, the \(i\)-th row gives the coefficient for the \(i\)-th harmonic, where \(i=l(l+1)+m\). The input signal (in the form of 2D image indexed by two angles) should be flattened with numpy.ndarray.ravel(), in row-major order: the row index varies the slowest, and the column index the quickest. Of shape ((l + 1) ** 2, 2 * n_lat ** 2).
- areas_on_unit_sphere (numpy.ndarray) – Area of the unit sphere covered by each sample point. This is proportional to sine of colatitude and has nothing to do with azimuth/longitude. Used as weights for discrete summation to approximate continuous integration. Necessary in SH analysis. Flattened also in row-major order. Of length n_lat * (2 * n_lat).
Return type:
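The role of the area weights in SH analysis can be illustrated without calling this function. The sketch below rests on several assumptions: it hard-codes the real SHs of bands 0 and 1 in closed form (one common sign convention), samples the sphere at cell midpoints in the colatitude-azimuth convention, and takes each sample's area as \(\sin(\text{colat})\) times the two angular step sizes. The library's sampling and sign conventions may differ.

```python
import numpy as np

n_lat = 64
n_azi = 2 * n_lat  # same step size over twice the angular range
# Midpoint sampling of colatitude in [0, pi] and azimuth in [0, 2 * pi]
colat = (np.arange(n_lat) + 0.5) * np.pi / n_lat
azi = (np.arange(n_azi) + 0.5) * 2 * np.pi / n_azi
colat_grid, azi_grid = np.meshgrid(colat, azi, indexing='ij')

# Unit vectors on the sphere
x = np.sin(colat_grid) * np.cos(azi_grid)
y = np.sin(colat_grid) * np.sin(azi_grid)
z = np.cos(colat_grid)

# Real SHs for bands 0 and 1, with rows ordered by i = l * (l + 1) + m
c0 = 0.5 * np.sqrt(1 / np.pi)
c1 = np.sqrt(3 / (4 * np.pi))
ymat = np.stack([c0 * np.ones_like(x), c1 * y, c1 * z, c1 * x])
ymat = ymat.reshape(4, -1)  # row-major flattening of each harmonic

# Area covered by each sample: sin(colat) * d_colat * d_azi
areas = np.sin(colat_grid) * (np.pi / n_lat) * (2 * np.pi / n_azi)
areas = areas.ravel()

gram = (ymat * areas) @ ymat.T  # close to identity if orthonormal

signal = 2.0 * ymat[0] - 0.5 * ymat[2]  # signal with known coefficients
coeffs = ymat @ (areas * signal)  # analysis: area-weighted dot products
recon = coeffs @ ymat  # synthesis: weighted sum of harmonics
```

Lowering `n_lat` makes `gram` drift away from the identity, which is the loss of orthonormality that the note on coarse discretization warns about.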
xiuminglib.tracker module
-
class xiuminglib.tracker.LucasKanadeTracker(frames, pts, backtrack_thres=1, lk_params=None)
Bases: object
Lucas-Kanade tracker.
-
pts
Type: numpy.array
-
tracks
Positions of tracks from the \(i\)-th to \((i+1)\)-th frame. Arrays are of shape N-by-2.

+------------>
|       tracks[:, 1]
|
|
v tracks[:, 0]

Type: list(numpy.array)
-
can_backtrack
Whether each track can be back-tracked to the previous frame. Arrays should be Boolean.
Type: list(numpy.array)
-
__init__(frames, pts, backtrack_thres=1, lk_params=None)
Parameters: - frames (list(numpy.array)) – Frame images in order. Arrays are either H-by-W or H-by-W-by-3, and will be converted to grayscale.
- pts (array_like) – Points to track in the first frame. Of shape N-by-2.

+------------>
|       pts[:, 1]
|
|
v pts[:, 0]

- backtrack_thres (float, optional) – Largest pixel deviation in the \(x\) or \(y\) direction of a successful backtrack.
- lk_params (dict, optional) – Keyword parameters for cv2.calcOpticalFlowPyrLK().
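The backtrack test implied by `backtrack_thres` can be sketched with NumPy alone. The helper `backtrack_ok` is hypothetical and only illustrates the thresholding step. In practice, the back-tracked positions would come from running cv2.calcOpticalFlowPyrLK() forward and then backward between two frames, and whether the class compares with `<=` or `<` is an assumption here.

```python
import numpy as np


def backtrack_ok(pts, pts_backtracked, backtrack_thres=1):
    """Hypothetical helper: flags tracks that survive a round trip."""
    pts = np.asarray(pts, dtype=float)
    pts_backtracked = np.asarray(pts_backtracked, dtype=float)
    dev = np.abs(pts - pts_backtracked)  # N-by-2 per-axis deviations
    # Succeeds only if neither coordinate deviates beyond the threshold
    return dev.max(axis=1) <= backtrack_thres


# Made-up positions: original points and where backward tracking put them
pts = np.array([[10., 20.], [30., 40.], [50., 60.]])
pts_back = np.array([[10.2, 19.9], [33., 40.], [50., 60.9]])
can_backtrack = backtrack_ok(pts, pts_back)
```

Here the second point drifts by 3 pixels along one axis and is flagged as not back-trackable, while the other two stay within 1 pixel.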
-