syconn.processing

syconn.processing.axoness

syconn.processing.axoness.calc_distance2soma(graph, nodes)[source]

Calculates the distance to a soma node for each node and stores it in place in node.data[‘dist2soma’]. A depth-first search graph is built at each node to obtain a sorted node ordering.

Parameters:
  • graph (graph of SkeletonAnnotation) –
  • nodes (SkeletonNodes) –
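
The distance computation can be illustrated with a small standalone sketch. It assumes the skeleton is available as a plain adjacency dictionary with edge lengths and that the soma node ids are known; the names `dist_to_soma`, `adjacency` and `soma_ids` are hypothetical and not part of syconn. The sketch uses a multi-source Dijkstra search, which is a swapped-in technique and not the depth-first traversal mentioned above.

```python
import heapq


def dist_to_soma(adjacency, soma_ids):
    """Return {node_id: shortest path length to the nearest soma node}.

    adjacency: {node_id: [(neighbor_id, edge_length), ...]} (hypothetical layout)
    soma_ids:  iterable of node ids that are labelled as soma
    """
    dist = {s: 0.0 for s in soma_ids}
    heap = [(0.0, s) for s in soma_ids]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for nb, length in adjacency.get(node, []):
            nd = d + length
            if nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                heapq.heappush(heap, (nd, nb))
    return dist


# Toy skeleton: 0 (soma) - 1 - 2, with a side branch 1 - 3
adjacency = {
    0: [(1, 100.0)],
    1: [(0, 100.0), (2, 50.0), (3, 30.0)],
    2: [(1, 50.0)],
    3: [(1, 30.0)],
}
distances = dist_to_soma(adjacency, soma_ids=[0])
```

In the documented function the result is written to node.data[‘dist2soma’] instead of being returned as a dictionary.
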
syconn.processing.axoness.predict_axoness_from_node_comments(anno)[source]

Extracts the axoness prediction from the node comments of the given contact site annotation.

Parameters:anno (SkeletonAnnotation) – Contact site
Returns:skeleton IDs, skeleton axoness
Return type:numpy.array, numpy.array
syconn.processing.axoness.predict_axoness_from_nodes(anno)[source]

Extracts the axoness prediction from the nodes of the given contact site annotation.

Parameters:anno (SkeletonAnnotation) – contact site.
Returns:skeleton IDs, skeleton axoness
Return type:numpy.array, numpy.array

syconn.processing.cell_types

syconn.processing.cell_types.calc_neuron_feat(path, wd)[source]

Calculate neuron features using neuron class

Parameters:path (str) – path to mapped annotation kzip
Returns:cell type features, skeleton ID
Return type:numpy.array, numpy.array
syconn.processing.cell_types.calc_neuron_feat_star(params)[source]
syconn.processing.cell_types.draw_feat_hist(wd, k=15, classes=(0, 1, 2, 3), nb_bars=20)[source]

Draws the histogram(s) of the k most important feature(s)

Parameters:
  • wd (str) – Path to working directory
  • k (int) – Number of features to be plotted
  • classes (tuple) – Class labels to evaluate
  • nb_bars (int) – Number of bars in histogram
syconn.processing.cell_types.find_cell_types_from_dict(wd, cell_type)[source]
Parameters:
  • wd (str) – Path to working directory
  • cell_type (int) – label (0 = EA, 1 = MSN, 2 = GP, 3 = INT)
Returns:

paths of cells of type cell_type.

Return type:

list of str

syconn.processing.cell_types.get_cell_type_classes_dict()[source]
Returns:dictionary from integer label to full cell name as string
Return type:dict
syconn.processing.cell_types.get_cell_type_labels()[source]

Cell type labels for HVC(0), LMAN(0), STN(0), MSN(1), GP(2), FS(3)

Returns:convention dictionary mapping cell type labels (str) to integer labels
Return type:dict
syconn.processing.cell_types.get_id_dict_from_skel_ids(skel_ids)[source]

Calculates dictionaries that map each skeleton ID to a new label (from 0 to len(skel_paths)) and back.

Parameters:skel_ids (list of int) –
Returns:Dictionary to get new ID with old ID, dictionary to get old ID with new ID
Return type:dict, dict
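
A minimal sketch of such a pair of lookup dictionaries, assuming the new labels simply follow the order of the input list and that the IDs are unique. The helper name `id_dicts` is hypothetical and only illustrates the idea.

```python
def id_dicts(skel_ids):
    """Build old->new and new->old lookups for contiguous relabelling."""
    # New labels are assigned in input order (assumption, not confirmed by the docs)
    old_to_new = {old: new for new, old in enumerate(skel_ids)}
    new_to_old = {new: old for old, new in old_to_new.items()}
    return old_to_new, new_to_old


old_to_new, new_to_old = id_dicts([17, 4, 230])
```

Contiguous labels of this kind are convenient as row indices into a feature array.
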
syconn.processing.cell_types.load_cell_gt(skel_ids, wd)[source]

Load cell types of skel ids

Parameters:
  • skel_ids (list of int) –
  • wd (str) –

Returns:cell type labels
Return type:np.array
syconn.processing.cell_types.load_celltype_feats(wd)[source]

Loads cell type feature and corresponding ids from dictionaries

Parameters:wd (str) – Path to working directory
Returns:cell type features
Return type:np.array
syconn.processing.cell_types.load_celltype_gt(wd, load_data=True, return_ids=False)[source]

Load ground truth of cell types

(HVC, LMAN, STN) => excitatory axons (0)
(MSN) => medium spiny neuron (1)
(GP) => pallidal-like neurons (2)
(FS) => inhibitory interneuron (3)

Parameters:wd (str) – Path to working directory
Returns:cell type features, cell type labels
Return type:numpy.array, numpy.array
syconn.processing.cell_types.load_celltype_preds(wd)[source]

Loads cell type predictions and corresponding ids from dictionaries

Parameters:wd (str) – Path to working directory
Returns:cell ids, cell labels
Return type:np.array, np.array
syconn.processing.cell_types.load_celltype_probas(wd)[source]

Loads cell type probabilities and corresponding ids from dictionaries

Parameters:wd (str) – Path to working directory
Returns:cell ids, cell label probabilities
Return type:np.array, np.array
syconn.processing.cell_types.predict_celltype_label(wd)[source]

Predict cell type labels in working directory with the pre-trained classifier in subfolder models/rf_celltypes/rf.pkl

Parameters:wd (str) – path to working directory
syconn.processing.cell_types.save_cell_type_clf(gt_path, clf_used='rf', load_data=True)[source]

Save cell type clf specified by clf_used to the ground truth directory gt_path.

Parameters:
  • gt_path (str) – path to cell type gt
  • clf_used (str) –
  • load_data (bool) –
syconn.processing.cell_types.save_cell_type_feats(wd)[source]

Saves cell type feature for type prediction

Parameters:wd (str) – Path to working directory
syconn.processing.cell_types.write_feats_importance(wd, load_data=True, clf_used='rf')[source]

Writes out feature importances and feature names to gt_dir

Parameters:
  • wd (str) –
  • load_data (bool) –
  • clf_used (str) –

syconn.processing.features

syconn.processing.features.assign_property2node(node, pred, property)[source]

Assign prediction of property to node

Parameters:
  • node (NewSkeletonNode) –
  • pred (prediction appropriate to property) –
  • property (property to change) –
syconn.processing.features.calc_prop_feat_dict(source, dist=6000)[source]

Calculates property feature

Parameters:
Returns:

Dictionary of property features, list of feature names, bool if spiness features are given

Return type:

dict, list of str, bool

syconn.processing.features.celltype_axoness_feature(anno)[source]

Calculates axoness features of a mapped skeleton for cell type prediction. These include the proportions of axon, dendrite and soma path lengths and the maximum degree of soma nodes.

Returns:axoness features
Return type:np.array (n x 4)
syconn.processing.features.get_obj_density(source, property='axoness_pred', value=1, obj='mito', return_abs_density=True)[source]

Calculate pathlength of nodes using edges

Parameters:
  • anno (list of SkeletonAnnotation) –
  • property (str) – e.g. ‘axoness_pred’
  • value (int) – value of property to check
  • obj (str) – mito/vc/sj
  • return_abs_density (bool) –
Returns:

length in um

syconn.processing.features.majority_vote(anno, property='axoness', max_dist=6000)[source]

Smooths the property prediction in the annotation by majority vote over a sliding window of 2 times max_dist. For axoness, soma nodes are left untouched.

Parameters:
  • anno (SkeletonAnnotation) –
  • property (str) – which property to average
  • max_dist (int) – maximum distance (in nm) for sliding window used in majority voting
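
The smoothing can be sketched for the simplest case of an unbranched path. This is only an illustration under stated assumptions: node positions are given as path lengths in nm, the window covers all nodes within max_dist on either side, and `keep_label` is a hypothetical stand-in for the soma exception (the actual label encoding is not given here). The function name `smooth_labels` is not part of syconn.

```python
from collections import Counter


def smooth_labels(positions, labels, max_dist, keep_label=None):
    """Majority vote within +/- max_dist along an unbranched path.

    positions:  path position of each node in nm
    labels:     integer prediction per node
    keep_label: nodes carrying this label are left unchanged (assumption)
    """
    smoothed = []
    for pos, label in zip(positions, labels):
        if keep_label is not None and label == keep_label:
            smoothed.append(label)
            continue
        # Votes come from the original labels, not from already smoothed ones
        window = [lab for p, lab in zip(positions, labels)
                  if abs(p - pos) <= max_dist]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


positions = [0, 1000, 2000, 3000, 4000, 5000]
labels = [2, 1, 1, 0, 1, 1]
result = smooth_labels(positions, labels, max_dist=1000, keep_label=2)
```

The isolated 0 at position 3000 is outvoted by its neighbors, while the node with the protected label keeps its value.
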
syconn.processing.features.morphology_feature(source, max_nn_dist=6000)[source]

Calculates features for discrimination tasks of neurite identities, such as axon vs. dendrite or cell types classification. Estimated on interpolated skeleton nodes. Features are calculated with a sliding window approach for each node. Window is 2*max_nn_dist (nm).

Parameters:
  • source (str) – Path to anno or MappedSkeletonObject
  • max_nn_dist (float) – Radius in which neighboring nodes are found and used for calculating features in nm.
Returns:

two arrays of features for each node with shape number of nodes x 28 (22 radius features and 6 object features), bool if spiness features are given

Return type:

numpy.array, numpy.array, list of int, bool

syconn.processing.features.node_branch_end_distance(nml, dist)[source]

Sets the distance to the next branch or end point for each node (capped by the given parameter dist) in the .data dictionary of each node and returns the values together with the node ids.

Parameters:
Returns:

distances to nearest end/branch point, node ids

Return type:

np.array, np.array

syconn.processing.features.nodes_in_pathlength(anno, max_path_len)[source]

Find nodes reachable in max_path_len from source node, calculated for every node in anno.

Parameters:
  • anno (AnnotationObject) –
  • max_path_len (float) – Maximum distance from source node
Returns:

list of lists containing reachable nodes in max_path_len where outer list has length len(anno.getNodes())
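
A rough sketch of the per-node neighborhood search, assuming an adjacency dictionary with edge lengths instead of an annotation object. The names `reachable_within` and `adjacency` are hypothetical; the search is a Dijkstra traversal that stops expanding once the path length limit is exceeded, and the source node is counted as reachable from itself.

```python
import heapq


def reachable_within(adjacency, source, max_path_len):
    """Return sorted ids of all nodes within max_path_len of source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry
        for nb, length in adjacency[node]:
            nd = d + length
            # Only follow edges that stay inside the path length budget
            if nd <= max_path_len and nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                heapq.heappush(heap, (nd, nb))
    return sorted(dist)


# Unbranched toy skeleton with 100 nm edges: 0 - 1 - 2 - 3
adjacency = {
    0: [(1, 100.0)],
    1: [(0, 100.0), (2, 100.0)],
    2: [(1, 100.0), (3, 100.0)],
    3: [(2, 100.0)],
}
# One list per node, so the outer list has one entry for every node
nearby = [reachable_within(adjacency, n, 150.0) for n in sorted(adjacency)]
```
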

syconn.processing.features.objfeat2skelnode(node_coords, node_radii, node_ids, nearby_node_list, obj_dict, scaling)[source]

Calculate features of UltrastructuralDatasetObjects along Skeleton

Parameters:
  • node_coords (np.array) –
  • node_radii (np.array) –
  • node_ids (np.array) –
  • nearby_node_list (list of list of SkeletonNodes) –
  • obj_dict (UltrastructuralDataset) –
  • scaling (tuple) –
Returns:

The two features are absolute number of assigned objects and mean voxel size of the objects

Return type:

np.array (dimension nb_skelnodes x 2)

syconn.processing.features.pathlength_of_property(anno, property, value)[source]

Calculate pathlength of nodes with certain property value

Parameters:
  • anno (SkeletonAnnotation) – mapped cell tracing
  • property (str) – spiness / axoness
  • value (int) – classification result, e.g. 0, 1, 2
Returns:

length (in um)

Return type:

int

syconn.processing.features.propertyfeat2skelnode(node_list)[source]

Calculate nodewise radius feature

Parameters:node_list (list) – grouped nodes
Returns:number of nodes times 22 features, containing mean radius, sigma of radii, 20 hist features
Return type:np.array
syconn.processing.features.radfeat2skelnode(nearby_node_list)[source]

Calculate nodewise radius feature

Parameters:nearby_node_list (list of list of SkeletonNodes) – grouped tracing nodes
Returns:array of number of nodes times 22 features, containing mean radius, sigma of radii, 20 hist features, spinehead features and whether spiness features are returned
Return type:np.array, np.array, bool
syconn.processing.features.radius_feats_from_nodes(nodes, nb_bins=10, max_rad=5000)[source]

Calculates mean, std and histogram features

Parameters:
  • nodes (list of SkeletonNodes) –
  • nb_bins (int) – Number of bins for histogram features
  • max_rad (int) – maximum radius to plot on histogram x-axis
Returns:

radius features with dim. nb_bins+2

Return type:

np.array
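
A minimal sketch of a feature vector with nb_bins+2 entries, assuming plain radius values as input, a population standard deviation, raw histogram counts, and radii above max_rad being counted in the last bin. None of these details are confirmed by the description above, and `radius_features` is a hypothetical name.

```python
import math


def radius_features(radii, nb_bins=10, max_rad=5000):
    """Return [mean, std, hist_0, ..., hist_{nb_bins-1}] for a list of radii."""
    mean = sum(radii) / len(radii)
    std = math.sqrt(sum((r - mean) ** 2 for r in radii) / len(radii))
    hist = [0] * nb_bins
    bin_width = max_rad / nb_bins
    for r in radii:
        # Radii at or above max_rad are clipped into the last bin (assumption)
        idx = min(int(r / bin_width), nb_bins - 1)
        hist[idx] += 1
    return [mean, std] + hist


feats = radius_features([100, 300, 300, 4900], nb_bins=10, max_rad=5000)
```
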

syconn.processing.features.sj_per_spinehead(anno)[source]

Calculate number of sj per spinehead. Iterate over all mapped sj objects and find the nearest skeleton node. If the skeleton node has spiness prediction == 1 (spinehead), increment the counter of this node by one. After the loop, sum over all counters and divide by the number of nodes which have at least one sj assigned.

Parameters:anno (SkeletonAnnotation) –

Returns:Average number of sj per spinehead (assumes there is no spinehead without sj)
Return type:float
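
The counting procedure described for sj_per_spinehead can be sketched as follows. The sketch assumes plain coordinate tuples for nodes and sj objects and a brute-force nearest-node search; the names `mean_sj_per_spinehead`, `node_coords`, `spiness_pred` and `sj_coords` are hypothetical, and returning 0.0 when no sj maps to a spinehead is an added assumption.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_sj_per_spinehead(node_coords, spiness_pred, sj_coords):
    counts = Counter()
    for sj in sj_coords:
        # Index of the skeleton node closest to this sj object
        nearest = min(range(len(node_coords)),
                      key=lambda i: sq_dist(node_coords[i], sj))
        if spiness_pred[nearest] == 1:  # 1 == spinehead
            counts[nearest] += 1
    if not counts:
        return 0.0  # assumption: no spinehead with sj found
    # Only nodes with at least one sj enter the average
    return sum(counts.values()) / len(counts)


node_coords = [(0, 0, 0), (100, 0, 0), (200, 0, 0)]
spiness_pred = [0, 1, 1]
sj_coords = [(90, 0, 0), (110, 10, 0), (210, 0, 0), (5, 0, 0)]
avg = mean_sj_per_spinehead(node_coords, spiness_pred, sj_coords)
```
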
syconn.processing.features.spiness_feats_from_nodes(nodes)[source]

Calculates spiness feats including abs. number of spineheads, mean and standard deviation (std) of spinehead size, mean spinehead probability and mean and std of spineneck lengths.

Parameters:nodes (list of SkeletonNodes) –
Returns:spiness features, dim. of 6
Return type:np.array
syconn.processing.features.update_property_feat_kzip(path2kzip, dist=6000)[source]

Recomputes the axoness feature of the skeleton at path2kzip and writes it to the .k.zip

Parameters:
  • path2kzip (str) – Path to mapped skeleton
  • dist (int) –
syconn.processing.features.update_property_feat_kzip_star(args)[source]

Helper function for update_property_feat_kzip

syconn.processing.learning_rfc

syconn.processing.learning_rfc.cell_classification(node_pred)[source]

Perform majority vote

Parameters:node_pred (np.array of int) – arbitrary array of integer
Returns:most frequently occurring integer in array
Return type:int
syconn.processing.learning_rfc.feature_importance(rf, save_path=None)[source]

Plots feature importance of sklearn RandomForest

Parameters:
  • rf (RandomForestClassifier) –
  • save_path (str) –
syconn.processing.learning_rfc.fscore(rec, prec, beta=1.0)[source]

Calculates f-score with beta value

Parameters:
  • rec (np.array) – recall
  • prec (np.array) – precision
  • beta (float) – weighting of precision
Returns:

f-score

Return type:

np.array
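
For reference, a sketch of the standard F-beta definition for scalar inputs. Whether syconn's fscore uses exactly this formula is an assumption, and the documented function works on arrays. In the standard definition, values of beta above 1 give more weight to recall. The name `f_beta` is hypothetical.

```python
def f_beta(rec, prec, beta=1.0):
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    b2 = beta ** 2
    denom = b2 * prec + rec
    if denom == 0:
        return 0.0  # convention chosen here for the degenerate case
    return (1 + b2) * prec * rec / denom


f1 = f_beta(0.5, 1.0)            # harmonic mean of recall and precision
f2 = f_beta(0.5, 1.0, beta=2.0)  # low recall is penalized more strongly
```
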

syconn.processing.learning_rfc.init_clf(clf_used, params=None)[source]
syconn.processing.learning_rfc.load_csv2feat(fpath, prop='axoness')[source]

Loads csv file from kzip and returns numpy array and list of header names. The last line is supposed to be the probability prediction.

Parameters:
  • fpath (str) – Source file path
  • prop (str) – property which should be loaded
Returns:

features, feature names

Return type:

np.array, list of str

syconn.processing.learning_rfc.load_rfcs(rf_axoness_p, rf_spiness_p)[source]

Loads pickled Random Forest Classifiers for axoness and spiness. If a path is not valid, None is returned.

Parameters:
  • rf_axoness_p (str) – Path to pickled axoness rf directory
  • rf_spiness_p (str) – Path to pickled spiness rf directory
Returns:

RFC axoness, spiness or None

syconn.processing.learning_rfc.loo_proba(x, y, clf_used='rf', use_pca=False, params=None)[source]

Perform leave-one-out

Parameters:
  • x (np.array) – features
  • y (np.array) – labels
  • clf_used (str) – classifier
  • use_pca (bool) – perform principal component analysis on features x in advance
  • params (dict) – parameter for classifier
Returns:

class probability, hard classification

Return type:

np.array, np.array

syconn.processing.learning_rfc.novel_multiclass_prediction(f_scores, thresholds, probs)[source]
syconn.processing.learning_rfc.plot_corr(x, y, title='', xr=[-1, -1], yr=[-1, -1], save_path=None, nbins=5, xlabel='Size x', ylabel='Size y')[source]
syconn.processing.learning_rfc.plot_pr(precision, recall, title='', r=[0.67, 1.01], legend_labels=None, save_path=None, nbins=5, colorVals=None, xlabel='Recall', ylabel='Precision', l_pos='lower left', legend=True, r_x=[0.67, 1.01], ls=22)[source]
syconn.processing.learning_rfc.save_train_clf(X, y, clf_used, dir_path, use_pca=False, params=None)[source]

Trains the classifier specified by clf_used with features X and labels y and saves it to dir_path.

Parameters:
  • X (np.array) – features
  • y (np.array) – labels
  • clf_used (str) – ‘rf’ or ‘svm’ for RandomForest or SupportVectorMachine, respectively
  • dir_path (str) – directory where to save pkl files of clf
  • use_pca (bool) – flag if pca should be performed
  • params (dict) – parameters for classifier
syconn.processing.learning_rfc.write_feat2csv(fpath, feat_arr, feat_names=None)[source]

Writes array with column names to csv file

Parameters:
  • fpath (str) – Path to file
  • feat_arr (np.array) – feature array
  • feat_names (list of str) – feature names
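
A small sketch of writing a feature array with an optional header row, using only the standard csv module. The helper name `write_rows_with_header` and the file layout (one row per sample, comma separated) are assumptions and may differ from what write_feat2csv produces.

```python
import csv
import os
import tempfile


def write_rows_with_header(fpath, feat_arr, feat_names=None):
    """Write feat_arr row by row, preceded by feat_names if given."""
    with open(fpath, "w", newline="") as f:
        writer = csv.writer(f)
        if feat_names is not None:
            writer.writerow(feat_names)
        writer.writerows(feat_arr)


fpath = os.path.join(tempfile.mkdtemp(), "feats.csv")
write_rows_with_header(fpath, [[1.5, 2.0], [0.25, 4.0]], ["area", "dist"])

# Read the file back to check the layout
with open(fpath, newline="") as f:
    rows = list(csv.reader(f))
```
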

syconn.processing.mapper

class syconn.processing.mapper.SkeletonMapper(source, dh, ix=None, soma=None, context_range=6000)[source]

Bases: object

Class to handle mapping of cell objects (mitochondria, vesicle clouds, synaptic clefts) to tracings. Mapping parameters are saved as attributes.

soma

SkeletonAnnotation – Soma tracing

old_anno

SkeletonAnnotation – original tracing where estimated cell radius is saved at each node

anno

SkeletonAnnotation – interpolated tracing skeleton for hull calculation

mitos/vc/sj

segmentationDataset – Dictionaries in which mapped cell objects are saved

ix

int – mapped skeleton id

write_obj_voxel

bool – write object voxel to kzip as binary file

annotate_object(objects, radius, method, objtype)[source]

Redirects mapping task to desired method-function

Parameters:
  • objects (UltrastructuralDataset) –
  • radius (int) – Radius of kd-tree in units of nm.
  • method (str) – either ‘hull’, ‘kd’ or ‘supervoxel’
  • objtype (string) – characterising object type
Returns:

mapped object ID’s

Return type:

list

annotate_objects(dh, radius=1200, method='hull', thresh=2.2, filter_size=(0, 0, 0), nb_neighbors=20, nb_hull_vox=500, neighbor_radius=220, detect_outlier=True, nb_rays=20, nb_voting_neighbors=100, max_dist_mult=1.4)[source]

Creates self.object with annotated objects as UltrastructuralDataset, where object is in {mitos, vc, sj}

Parameters:
  • dh (DataHandler) – object containing SegmentationDataObjects mitos, vc, sj
  • radius (int) – Radius in nm. A single integer applies the same radius to all objects; a list of three integers must follow the ordering [mitos, vc, sj].
  • method (str) – Either ‘kd’ for a fixed radius or ‘hull’/‘supervoxel’ if membrane is available.
  • thresh (float) – Denotes the factor which is multiplied with the maximum membrane probability. The resulting value is used as the threshold beyond which the membrane is assumed to be existent.
  • filter_size (int) – List of integers, one for each object [mitos, vc, sj]
  • nb_neighbors (int) – minimum number of neighbors needed during outlier detection for a single hull point to survive.
  • nb_hull_vox (int) – Number of object hull voxels which are used to estimate spatial proximity to skeleton (inside or outside).
  • neighbor_radius (int) – Radius (nm) of ball in which to look for supporting hull voxels. Used during outlier detection.
  • detect_outlier (bool) – use outlier-detection if True.
  • nb_rays (int) – Number of rays sent at each skeleton node (multiplied by a factor of 5). Defines the angle between two rays (=360 / nb_rays) in the orthogonal plane.
  • nb_voting_neighbors (int) – Number of votes of skeleton hull voxels (membrane representation) for object-mapping. Used for vc and mitos during geometrical position estimation of object nodes.
  • max_dist_mult (float) – Multiplier for radius to estimate maximal distance of hull points to source node.
calc_myelinisation()[source]

Calculates myelinisation at each node and writes it to node.data[“myelin_pred”]

cset
get_plot_obj()[source]

Extracts coordinates from annotated SegmentationObjects

Returns:object-voxels for each object
Return type:np.array
hull_coords

Scaled hull coordinates of skeleton membrane

Returns:Coordinates of each hull point
Return type:np.array
hull_normals

Normal for each hull point pointing outwards

Returns:Normal vector of each hull point pointing outwards
Return type:np.array
hull_sampling(thresh=2.2, nb_rays=20, nb_neighbors=20, neighbor_radius=220, detect_outlier=True, max_dist_mult=1.4)[source]

Calculates hull of tracing

Parameters:
  • thresh (float) – factor of maximum occurring prediction value after which membrane is triggered active.
  • nb_rays (int) – Number of rays sent at each skeleton node (multiplied by a factor of 5). Defines the angle between two rays (=360 / nb_rays) in the orthogonal plane.
  • nb_neighbors (int) – minimum number of neighbors needed during outlier detection for a single hull point to survive.
  • neighbor_radius (int) – Radius of ball in which to look for supporting hull voxels. Used during outlier detection.
  • detect_outlier (bool) – use outlier-detection if True.
  • max_dist_mult (float) – Multiplier for radius to generate maximal distance of hull points to source node.
Returns:

Average radius per node in (9,9,20) corrected units estimated by rays propagated through Membrane prediction until threshold reached.

Return type:

numpy.array

merge_soma_tracing()[source]
predict_property(rf, prop, max_neck2endpoint_dist=3000, max_head2endpoint_dist=600)[source]

Predict property (axoness, spiness) of tracings

Parameters:
  • rf (RandomForestClassifier) –
  • prop (str) – property name
  • max_neck2endpoint_dist (int) –
  • max_head2endpoint_dist (int) –
property_features

Getter of property features, calculates axoness/spiness features if necessary

Returns:property features, if spiness feature are given
Return type:np.array, bool
skel_radius

Radius of membrane at each skeleton node

Returns:cell radius at self.nodes
Return type:np.array
write2kzip(path)[source]

Writes interpolated skeleton (and annotated objects) to nml at path. If self.write_obj_voxel flag is True a .txt file containing all object voxel with id is written in k.zip

Parameters:path (str) – Path to kzip destination
write2pkl(path)[source]

Writes MappedSkeleton object to .pkl file. Path is extracted from dh._datapath and MappedSkeleton ID.

Parameters:path (str) – Path to kzip destination
syconn.processing.mapper.calc_syn_dict(features, axoness_info, get_all=False)[source]

Creates dictionary of synapses. Keys are ids of pre cells and values are dictionaries of corresponding synapses with post cell ids.

Parameters:
  • features (np.array) – synapse feature
  • axoness_info (np.array) – string containing axoness information of cells
  • get_all (bool) – collect all contact sites
Returns:

synapse features, axoness information, connectivity, post synaptic cell ids, synapse predictions, axoness

Return type:

np.array, np.array, dict, np.array, np.array, dict

syconn.processing.mapper.cs_btw_annos(anno_a, anno_b, max_hull_dist, concom_dist)[source]

Computes contact sites between two annotation objects and returns hull points of both skeletons near contact site.

Parameters:
  • anno_a (SkeletonAnnotation) – Annotation object A
  • anno_b (SkeletonAnnotation) – Annotation object B
  • max_hull_dist (int) – Maximum distance between skeletons in nm
  • concom_dist (int) – maximum distance of connected components (nm)
Returns:

List of hull coordinates for each contact site

Return type:

list

syconn.processing.mapper.feature_valid_syns(cs_dir, only_sj=True, only_syn=True, all_contacts=False)[source]

Returns the features of valid synapses predicted by synapse rfc

Parameters:
  • cs_dir (str) – Path to computed contact sites.
  • only_sj (bool) – Return feature of all contact sites with mapped sj.
  • only_syn (bool) – Returns feature only if synapse was predicted
  • all_contacts (bool) – Use all contact sites for feature extraction
Returns:

features, array of contact site IDS, boolean array of synapse prediction

Return type:

np.array (n x f), np.array (n x 1), np.array (n x 1)

syconn.processing.mapper.get_radii_hull(args)[source]
syconn.processing.mapper.max_nodes_in_path(anno, source_node, max_number)[source]

Find specified number of nodes along skeleton from source node (BFS).

Parameters:
Returns:

Tracing nodes up to certain distance from source node

Return type:

list of SkeletonNodes
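
A minimal breadth-first search that stops after a fixed number of nodes, assuming an adjacency dictionary of node ids and that the source node itself counts towards the limit (with max_number of at least 1). The name `bfs_limited` is hypothetical and the sketch is not the library's implementation.

```python
from collections import deque


def bfs_limited(adjacency, source, max_number):
    """Return up to max_number node ids in BFS order, starting at source."""
    visited = [source]
    seen = {source}
    queue = deque([source])
    while queue and len(visited) < max_number:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in seen:
                seen.add(nb)
                visited.append(nb)
                queue.append(nb)
                if len(visited) == max_number:
                    break
    return visited


adjacency = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
first_three = bfs_limited(adjacency, source=0, max_number=3)
everything = bfs_limited(adjacency, source=0, max_number=10)
```
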

syconn.processing.mapper.node_id2key(segdataobject, node_ids, filter_size)[source]

Maps list indices in node_ids to keys of SegmentationObjects. Filters objects bigger than filter_size.

Parameters:
  • segdataobject (UltrastructuralDataset) – of object type currently processed
  • node_ids (list of list) – annotated object ids for each node
  • filter_size (int) – minimum number of voxels of object

Returns:objects keys
Return type:list
syconn.processing.mapper.outlier_detection(point_list, min_num_neigh, radius)[source]

Finds hull outlier using point density criterion

Parameters:
  • point_list (list) – List of coordinates
  • min_num_neigh (int) – Minimum number of neighbors, s.t. hull-point survives.
  • radius (int) – Radius in nm to look for neighbors
Returns:

Cleaned point cloud

Return type:

numpy.array
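
The point density criterion can be sketched with a brute-force neighbor count. This is an illustration under assumptions: a point counts as a neighbor if it lies within the given radius, the point itself is not counted, and the search is quadratic (a kd-tree would be the usual choice for large point clouds). The name `remove_sparse_points` is hypothetical.

```python
def remove_sparse_points(point_list, min_num_neigh, radius):
    """Keep only points with at least min_num_neigh neighbors within radius."""
    r2 = radius ** 2
    kept = []
    for i, p in enumerate(point_list):
        neighbors = 0
        for j, q in enumerate(point_list):
            # Compare squared distances to avoid taking square roots
            if i != j and sum((a - b) ** 2 for a, b in zip(p, q)) <= r2:
                neighbors += 1
        if neighbors >= min_num_neigh:
            kept.append(p)
    return kept


points = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (1000, 1000, 1000)]
cleaned = remove_sparse_points(points, min_num_neigh=2, radius=20)
```

The isolated point far away from the cluster has no neighbors within the radius and is dropped.
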

syconn.processing.mapper.prepare_syns_btw_annos(pairwise_paths, dest_path, max_hull_dist=60, concom_dist=300)[source]

Checks pairwise for contact sites between annotation objects found at paths in nml_list. Adds sj, vc and nearest skeleton nodes to found contact sites. Writes ‘contact_sites.nml’ to nml-path containing contact sites of all nml’s.

Parameters:
  • pairwise_paths (list of str) – List of pairwise paths to nml’s
  • dest_path (str) – Path to directory where to store result of synapse mapping
  • max_hull_dist (float) – maximum distance between skeletons in nm
  • concom_dist (float) – Maximum distance of connected components (nm)
syconn.processing.mapper.read_pair_cs(pair_path)[source]

Helper function to collect pairwise contact site information. Extracts axoness prediction.

Parameters:pair_path (str) – path to pairwise contact site kzip
Returns:annotation object without contact site hull voxel
Return type:SkeletonAnnotation
syconn.processing.mapper.readout_cs_info(args)[source]

Helper function of feature_valid_syns

Parameters:args (tuple) – path to file and queue
Returns:synapse features, contact site ID
Return type:np.array, str
syconn.processing.mapper.similarity_check(skel_a, skel_b)[source]

If the absolute number of identical nodes is bigger than a certain threshold, the skeletons are considered similar.

Parameters:
Returns:

skel_a and skel_b are similar

Return type:

bool
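
A sketch of such a check, assuming that nodes are compared by their coordinates and that the threshold is a free parameter. The threshold value used by syconn is not stated here, so `min_shared` and the function name `are_similar` are purely hypothetical.

```python
def are_similar(coords_a, coords_b, min_shared=2):
    """True if the two skeletons share more than min_shared node coordinates."""
    shared = set(map(tuple, coords_a)) & set(map(tuple, coords_b))
    return len(shared) > min_shared


skel_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
skel_b = [(1, 0, 0), (2, 0, 0), (3, 0, 0), (9, 9, 9)]
similar = are_similar(skel_a, skel_b, min_shared=2)
```
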

syconn.processing.mapper.similarity_check_star(params)[source]

Helper function

syconn.processing.mapper.syn_btw_anno_pair(params)[source]

Get synapse information between two mapped annotation objects. Details are written to pairwise nml (all contact sites between pairs contained) and to nml for each contact site.

Parameters:
  • params (list) – [path_a, path_b, max_hull_dist, concom_dist]
  • path_a (str) – path to mapped annotation object
  • path_b (str) – path to mapped annotation object
  • max_hull_dist (float) – maximum distance between skeletons (nm)
  • concom_dist (float) – maximum distance of connected components (nm)
syconn.processing.mapper.translate_dense_tracings()[source]

syconn.processing.spiness

syconn.processing.spiness.assign_neck(anno, max_head2endpoint_dist=600, max_neck2endpoint_dist=3000)[source]

Assigns nodes between the spine head node and the first node with degree 2 as spine necks in place. Heads are labelled 1 and shafts 0. Key for prediction is “spiness_pred”.

Parameters:
  • anno (SkeletonAnnotation) – mapped cell tracing
  • max_head2endpoint_dist (int) – maximum distance between spine head and endpoint on graph
  • max_neck2endpoint_dist (int) – maximum distance between spine neck and endpoint on graph
syconn.processing.spiness.collect_spineheads(anno, dist=6000)[source]

Searches nodes in annotation for nodes with spinehead prediction and returns them as list (no copy!).

syconn.processing.synapticity

syconn.processing.synapticity.calc_syn_feature(gt_samples, ignore_keys=['Barrier', 'Skel'], new_data=False, test_data=False, detailed_cs_dir='/lustre/pschuber/m_consensi_rr/nml_obj/contact_sites_new3/')[source]

Collects synapse features of all contact sites. Additionally collects ground truth values if test_data is True.

Parameters:
  • gt_samples – List of paths to contact sites
  • ignore_keys – Which keys to ignore in string if collecting GT value
  • new_data – outdated
  • test_data – whether to collect GT value
  • detailed_cs_dir – path to folder containing the contact sites

syconn.processing.synapticity.helper_load_sj_feat(args)[source]
Parameters:args
Returns:
syconn.processing.synapticity.pairwise_syn_feature_calc(args)[source]

Helper function for calc_syn_feature. Collects features of a contact site.

Parameters:args – path to contact sites, list of ignore keys, path to contact_sites folder, q of multiprocess manager, bool new data (old), bool test_data (whether to collect gt_value)
Returns:synapse feature, ground truth value

syconn.processing.synapticity.parse_synfeature_from_node(node)[source]

Parses values of features from string.

Parameters:node – node with values of feature_names
Returns:array of float values for each feature

syconn.processing.synapticity.parse_synfeature_from_txt(txt)[source]

Parses values of features from string.

Parameters:txt – String with values of feature_names, like ‘area1.5_dist2.3’
Returns:array of float values for each feature
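
Parsing a string such as ‘area1.5_dist2.3’ can be sketched with a regular expression. The sketch assumes that every feature name consists of letters only and is directly followed by its numeric value; names containing digits would need a different approach. The name `parse_features` is hypothetical.

```python
import re


def parse_features(txt):
    """Extract the float that follows each alphabetic feature name."""
    return [float(v) for v in re.findall(r"[A-Za-z]+(-?\d+(?:\.\d+)?)", txt)]


values = parse_features("area1.5_dist2.3")
```
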

syconn.processing.synapticity.save_synapse_clf(gt_path, clf_used='rf')[source]

Save synapse clf specified by clf_used to gt_directory.

Parameters:
  • gt_path (str) – path to directory of synapse ground truth
  • clf_used – ‘rf’ or ‘svm’

syconn.processing.synapticity.syn_sign_prediction(voxels, kd_path_sym, kd_path_asym, threshold=0.25)[source]

syconn.processing.initialization

syconn.processing.initialization.initialize_cset(kd, home_path, chunksize)[source]

Initializes a ChunkDataset

Parameters:
  • kd (KnossosDataset) – KnossosDataset instance of the corresponding raw data
  • home_path (str) – path to main folder
  • chunksize (np.array) – size of each chunk; typically in the order of ~ [1000, 1000, 500]
Returns:

cset

Return type:

ChunkDataset

syconn.processing.objectextraction

syconn.processing.objectextraction.apply_merge_list(cset, chunk_list, filename, hdf5names, merge_list_dict, debug, suffix='', qsub_pe=None, qsub_queue=None)[source]

Applies merge list to all chunks

Parameters:
  • cset (chunkdataset instance) –
  • chunk_list (list of int) – Selective list of chunks for which this function should work on. If None all chunks are used.
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • merge_list_dict (dictionary) – mergedict for each hdf5name
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
syconn.processing.objectextraction.calculate_chunk_numbers_for_box(cset, offset, size)[source]

Calculates the chunk ids that are (partly) contained in the defined volume

Parameters:
  • cset (ChunkDataset) –
  • offset (np.array) – offset of the volume to the origin
  • size (np.array) – size of the volume
Returns:

  • chunk_list (list) – chunk ids
  • dictionary (dict) – with reverse mapping
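
The box-to-chunk lookup can be illustrated on a regular grid. The sketch assumes integer voxel coordinates, chunks of equal size, chunk ids assigned in row-major order, and a reverse mapping from chunk id to its position in the returned list. All of these are assumptions, as is the name `chunks_in_box`; the ChunkDataset may organize its chunks differently.

```python
from itertools import product


def chunks_in_box(chunk_size, grid_shape, offset, size):
    """Return ids of all chunks that overlap the box [offset, offset + size)."""
    # First and last chunk index touched by the box along each axis
    lo = [max(o // c, 0) for o, c in zip(offset, chunk_size)]
    hi = [min((o + s - 1) // c, g - 1)
          for o, s, c, g in zip(offset, size, chunk_size, grid_shape)]
    chunk_list = []
    for x, y, z in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
        # Row-major chunk id (assumption)
        chunk_list.append((x * grid_shape[1] + y) * grid_shape[2] + z)
    reverse = {cid: i for i, cid in enumerate(chunk_list)}
    return chunk_list, reverse


chunk_list, reverse = chunks_in_box(
    chunk_size=(1000, 1000, 500),
    grid_shape=(4, 4, 4),
    offset=(900, 0, 0),
    size=(200, 100, 100),
)
```

The box straddles the border between the first two chunks along x, so both are returned.
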

syconn.processing.objectextraction.concatenate_mappings(cset, filename, hdf5names, debug=False, suffix='')[source]

Combines all map dicts

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • suffix (str) – Suffix for the intermediate results
syconn.processing.objectextraction.create_datasets_from_objects(cset, filename, hdf5names, debug=False, suffix='', qsub_pe=None, qsub_queue=None)[source]

Create dataset instance from objects

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
syconn.processing.objectextraction.create_objects_from_voxels(cset, filename, hdf5names, granularity=15, debug=False, suffix='', qsub_pe=None, qsub_queue=None)[source]

Creates object instances from extracted voxels

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • granularity (int) – Defines granularity for partitioning data for multiprocessing
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
syconn.processing.objectextraction.extract_voxels(cset, filename, hdf5names, debug=False, chunk_list=None, suffix='', qsub_pe=None, qsub_queue=None)[source]

Extracts voxels for each component id

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • chunk_list (list of int) – Selective list of chunks for which this function should work on. If None all chunks are used.
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
syconn.processing.objectextraction.from_ids_to_objects(cset, filename, hdf5names, chunk_list=None, debug=False, offset=None, size=None, suffix='', qsub_pe=None, qsub_queue=None)[source]

Main function for the object extraction step; combines all needed steps

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • chunk_list (list of int) – List of chunks this function should operate on. If None, all chunks are used.
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • offset (np.array) – offset of the volume to the origin
  • size (np.array) – size of the volume
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
syconn.processing.objectextraction.from_probabilities_to_objects(cset, filename, hdf5names, overlap='auto', sigmas=None, thresholds=None, chunk_list=None, debug=False, swapdata=0, label_density=array([ 1., 1., 1.]), offset=None, size=None, membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, suffix='', qsub_pe=None, qsub_queue=None)[source]

Main function for the object extraction step; combines all needed steps

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • overlap (str or np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps; if ‘auto’ the overlap is calculated from the sigma and the stitch_overlap (here: [1., 1., 1.])
  • sigmas (list of lists or None) – Defines the sigmas of the gaussian filters applied to the probability maps. Has to be the same length as hdf5names. If None no gaussian filter is applied
  • thresholds (list of float) – Threshold for cutting the probability map. Has to be the same length as hdf5names. If None zeros are used instead (not recommended!)
  • chunk_list (list of int) – List of chunks this function should operate on. If None, all chunks are used.
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • swapdata (boolean) – If true an x-z swap is applied to the data prior to processing
  • label_density (np.array) – Defines the density of the data. If the data was downsampled prior to saving, it has to be interpolated before processing because of alignment issues with the coordinate system. Data downsampled by a factor of two would have a label_density of [2, 2, 2]
  • offset (np.array) – offset of the volume to the origin
  • size (np.array) – size of the volume
  • membrane_filename (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Filename of the prediction in the chunkdataset. The threshold is currently set at 0.4.
  • membrane_kd_path (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Path to the knossosdataset containing a membrane segmentation. The threshold is currently set at 0.4.
  • hdf5_name_membrane (str) – When using the membrane_filename this key has to be given to access the data in the saved chunk
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
syconn.processing.objectextraction.from_probabilities_to_objects_parameter_sweeping(cset, filename, hdf5names, nb_thresholds, overlap='auto', sigmas=None, chunk_list=None, swapdata=0, label_density=array([ 1., 1., 1.]), offset=None, size=None, membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, qsub_pe=None, qsub_queue=None)[source]

Sweeps over different thresholds. The results of each object extraction are saved in a separate folder, and all intermediate steps are saved with a different suffix.

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • nb_thresholds (integer) – number of thresholds and therefore runs of objectextractions to do; the actual thresholds are equally spaced
  • overlap (str or np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps; if ‘auto’ the overlap is calculated from the sigma and the stitch_overlap (here: [1., 1., 1.])
  • sigmas (list of lists or None) – Defines the sigmas of the gaussian filters applied to the probability maps. Has to be the same length as hdf5names. If None no gaussian filter is applied
  • chunk_list (list of int) – List of chunks this function should operate on. If None, all chunks are used.
  • swapdata (boolean) – If true an x-z swap is applied to the data prior to processing
  • label_density (np.array) – Defines the density of the data. If the data was downsampled prior to saving, it has to be interpolated before processing because of alignment issues with the coordinate system. Data downsampled by a factor of two would have a label_density of [2, 2, 2]
  • offset (np.array) – offset of the volume to the origin
  • size (np.array) – size of the volume
  • membrane_filename (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Filename of the prediction in the chunkdataset. The threshold is currently set at 0.4.
  • membrane_kd_path (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Path to the knossosdataset containing a membrane segmentation. The threshold is currently set at 0.4.
  • hdf5_name_membrane (str) – When using the membrane_filename this key has to be given to access the data in the saved chunk
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment name
  • qsub_queue (str or None) – qsub queue name
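
The phrase "equally spaced" for nb_thresholds can be illustrated with a small sketch. The value range used here (the open interval between 0 and 1), the helper name equally_spaced_thresholds and the suffix format are assumptions made for illustration only; the source does not state which range or naming the function actually uses.

```python
def equally_spaced_thresholds(nb_thresholds, low=0.0, high=1.0):
    """Return nb_thresholds values evenly spaced strictly between low and high.

    The range (low, high) is an assumption; it is not taken from syconn.
    """
    step = (high - low) / (nb_thresholds + 1)
    return [low + step * (i + 1) for i in range(nb_thresholds)]


thresholds = equally_spaced_thresholds(4)

# One hypothetical suffix per run, so that intermediate results do not collide.
suffixes = ["_thr%d" % i for i in range(len(thresholds))]

for suffix, threshold in zip(suffixes, thresholds):
    print(suffix, round(threshold, 3))
```

Each (suffix, threshold) pair would then correspond to one object extraction run.
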
syconn.processing.objectextraction.gauss_threshold_connected_components(cset, filename, hdf5names, overlap='auto', sigmas=None, thresholds=None, chunk_list=None, debug=False, swapdata=False, label_density=array([ 1., 1., 1.]), membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, fast_load=False, suffix='', qsub_pe=None, qsub_queue=None)[source]

Extracts connected components from probability maps in three steps:

  1. Gaussian filter (defined by sigma)
  2. Thresholding (defined by threshold)
  3. Connected components analysis

In case of vesicle clouds (hdf5_name in [“p4”, “vc”]) the membrane segmentation is used to split vesicle clouds that are connected across cells (only if a membrane segmentation is provided).

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • overlap (str or np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps; if ‘auto’ the overlap is calculated from the sigma and the stitch_overlap (here: [1., 1., 1.])
  • sigmas (list of lists or None) – Defines the sigmas of the gaussian filters applied to the probability maps. Has to be the same length as hdf5names. If None no gaussian filter is applied
  • thresholds (list of float) – Threshold for cutting the probability map. Has to be the same length as hdf5names. If None zeros are used instead (not recommended!)
  • chunk_list (list of int) – List of chunks this function should operate on. If None, all chunks are used.
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • swapdata (boolean) – If true an x-z swap is applied to the data prior to processing
  • label_density (np.array) – Defines the density of the data. If the data was downsampled prior to saving, it has to be interpolated before processing because of alignment issues with the coordinate system. Data downsampled by a factor of two would have a label_density of [2, 2, 2]
  • membrane_filename (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Filename of the prediction in the chunkdataset. The threshold is currently set at 0.4.
  • membrane_kd_path (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Path to the knossosdataset containing a membrane segmentation. The threshold is currently set at 0.4.
  • hdf5_name_membrane (str) – When using the membrane_filename this key has to be given to access the data in the saved chunk
  • fast_load (boolean) – If true, the chunk data is loaded without checking that there is enough offset to compute the overlap area. This is faster because no neighbouring chunks have to be accessed; by default the overlap area is loaded from them.
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
Returns:

  • results_as_list (list) – list containing information about the number of connected components in each chunk
  • overlap (np.array)
  • stitch overlap (np.array)
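
The three steps (Gaussian filter, thresholding, connected components analysis) can be sketched on a single in-memory array as follows. This is a minimal illustration that assumes scipy.ndimage is available; the function name gauss_threshold_cc is hypothetical, and the sketch ignores chunking, overlaps and the membrane handling described above, so it is not the syconn implementation.

```python
import numpy as np
from scipy import ndimage


def gauss_threshold_cc(prob_map, sigma=None, threshold=0.5):
    """Smooth, threshold and label one probability map (illustrative only)."""
    data = np.asarray(prob_map, dtype=np.float32)
    # Step 1: optional Gaussian smoothing; skipped when sigma is None.
    if sigma is not None:
        data = ndimage.gaussian_filter(data, sigma=sigma)
    # Step 2: cut the probability map at the threshold.
    mask = data > threshold
    # Step 3: label connected components (scipy's default face connectivity).
    labels, nb_components = ndimage.label(mask)
    return labels, nb_components


# Toy volume with two well-separated high-probability blobs.
volume = np.zeros((20, 20, 20), dtype=np.float32)
volume[2:6, 2:6, 2:6] = 1.0
volume[12:17, 12:17, 12:17] = 1.0

labels, nb_components = gauss_threshold_cc(
    volume, sigma=[1.0, 1.0, 1.0], threshold=0.5
)
print(nb_components)
```

The returned label array uses 0 for background and consecutive positive integers for the components, which is the convention of scipy.ndimage.label.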

syconn.processing.objectextraction.make_merge_list(hdf5names, stitch_list, max_labels)[source]

Creates a merge list from a stitch list by mapping all connected ids to one id

Parameters:
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • stitch_list (dictionary) – Contains pairs of overlapping component ids for each hdf5name
  • max_labels (dictionary) – Contains the number of different component ids for each hdf5name
Returns:

  • merge_dict (dictionary) – mergelist for each hdf5name
  • merge_list_dict (dictionary) – mergedict for each hdf5name
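
Mapping all connected ids to one id is commonly done with a union-find (disjoint-set) structure. The sketch below shows that technique for a single hdf5name; whether syconn uses union-find internally is not stated in the source, and the names build_merge_mapping and stitch_pairs are hypothetical.

```python
def build_merge_mapping(stitch_pairs, max_label):
    """Map every label in 0..max_label to one representative per connected group."""
    parent = list(range(max_label + 1))  # index 0 is kept for background

    def find(label):
        # Follow parent links to the root, compressing the path on the way.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    for a, b in stitch_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as representative so the result is deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return [find(label) for label in range(max_label + 1)]


# Ids 1, 4 and 6 touch each other, as do 2 and 5; id 3 is isolated.
pairs = [(1, 4), (4, 6), (2, 5)]
mapping = build_merge_mapping(pairs, max_label=6)
print(mapping)
```

Indexing the returned list with an original id gives the id it is merged into, so the list can be applied to a label array as a lookup table.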

syconn.processing.objectextraction.make_stitch_list(cset, filename, hdf5names, chunk_list, stitch_overlap, overlap, debug, suffix='', qsub_pe=None, qsub_queue=None)[source]

Creates a stitch list for the overlap region between chunks

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • chunk_list (list of int) – List of chunks this function should operate on. If None, all chunks are used.
  • overlap (np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps
  • stitch_overlap (np.array) – Defines the overlap with neighbouring chunks that is left for stitching
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
Returns:

  • stitch_list (list) – list of overlapping component ids
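
As a rough illustration of what "overlapping component ids" means, the sketch below compares the labels that two neighbouring chunks assign to the same voxels of their shared overlap region and records every pair of non-zero ids that coincide. The function name overlapping_pairs, the flattened list representation and the assumption that 0 marks background are all made for illustration; they are not taken from syconn.

```python
def overlapping_pairs(labels_a, labels_b):
    """Collect unique (id_a, id_b) pairs that occupy the same voxel."""
    pairs = set()
    for a, b in zip(labels_a, labels_b):
        # Skip voxels that are background (assumed to be 0) in either chunk.
        if a != 0 and b != 0:
            pairs.add((a, b))
    return sorted(pairs)


# Flattened labels of the same overlap voxels, as seen from two chunks.
overlap_from_chunk_a = [0, 1, 1, 0, 2, 2]
overlap_from_chunk_b = [0, 7, 7, 8, 8, 0]

stitch_pairs = overlapping_pairs(overlap_from_chunk_a, overlap_from_chunk_b)
print(stitch_pairs)
```

Pairs of this kind are the input a merge step needs in order to decide which ids belong to the same object.
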
syconn.processing.objectextraction.make_unique_labels(cset, filename, hdf5names, chunk_list, max_nb_dict, chunk_translator, debug, suffix='', qsub_pe=None, qsub_queue=None)[source]

Makes labels unique across chunks

Parameters:
  • cset (chunkdataset instance) –
  • filename (str) – Filename of the prediction in the chunkdataset
  • hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
  • chunk_list (list of int) – List of chunks this function should operate on. If None, all chunks are used.
  • max_nb_dict (dictionary) – Maps each chunk id to an integer offset that needs to be added to all its entries
  • chunk_translator (boolean) – Remapping from chunk ids to position in chunk_list
  • debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
  • suffix (str) – Suffix for the intermediate results
  • qsub_pe (str or None) – qsub parallel environment
  • qsub_queue (str or None) – qsub queue
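
The idea behind per-chunk offsets can be sketched as follows: each chunk's labels start at 1, so adding the number of components found in all previous chunks makes the ids globally unique. The helper names, the flattened list representation and the choice to leave 0 (assumed background) untouched are assumptions for illustration, not the syconn implementation.

```python
def cumulative_offsets(nb_components_per_chunk):
    """Map each chunk id to the number of components in all earlier chunks."""
    offsets = {}
    running_total = 0
    for chunk_id in sorted(nb_components_per_chunk):
        offsets[chunk_id] = running_total
        running_total += nb_components_per_chunk[chunk_id]
    return offsets


def shift_labels(labels, offset):
    """Add offset to every non-zero label; 0 is assumed to be background."""
    return [label + offset if label != 0 else 0 for label in labels]


# Flattened toy labels per chunk and the number of components in each chunk.
chunk_labels = {0: [0, 1, 2, 2], 1: [1, 2, 0, 3], 2: [0, 0, 1, 1]}
nb_components = {0: 2, 1: 3, 2: 1}

offsets = cumulative_offsets(nb_components)
unique_labels = {
    chunk_id: shift_labels(labels, offsets[chunk_id])
    for chunk_id, labels in chunk_labels.items()
}
print(offsets)
print(unique_labels)
```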

syconn.processing.watershed_segmentation