Posts by Collection

code

S.A.S.S.A.F.R.A.S.

    A small script that looks for unread Google Scholar Alerts emails in your Gmail account and saves each paper in a Google Spreadsheet as:
    Title / Authors - Journal / Google Scholar link / Date / Number of alerts that contained the paper
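    The bookkeeping step described above (one row per paper, plus a count of the alerts that mentioned it) can be sketched as follows. This is only an illustrative sketch, not the script itself: the function name `aggregate_alerts`, the dictionary keys, and the choice to match papers by normalized title are all assumptions, and reading Gmail or writing to a spreadsheet is left out entirely.

```python
from collections import OrderedDict


def aggregate_alerts(alerts):
    """Merge papers from several alert emails into one row per paper.

    `alerts` is a list of emails; each email is a list of dicts with the
    hypothetical keys "title", "authors_journal", "link" and "date".
    Returns rows of the form
    [title, authors_journal, link, date, number_of_alerts].
    """
    rows = OrderedDict()
    for email in alerts:
        seen_in_email = set()
        for paper in email:
            # Assumption: two entries are the same paper if their titles
            # match after trimming whitespace and lowercasing.
            key = paper["title"].strip().lower()
            if key in seen_in_email:
                continue  # count each email at most once per paper
            seen_in_email.add(key)
            if key not in rows:
                rows[key] = [
                    paper["title"],
                    paper["authors_journal"],
                    paper["link"],
                    paper["date"],
                    0,
                ]
            rows[key][4] += 1
    return list(rows.values())
```

    Keeping the first-seen title, link and date and only incrementing the counter on later sightings is a design choice made for this sketch; the original script may resolve duplicates differently.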

TDA summary for gene-expression: Mapper Algorithm in 2D

    Code for reproducing results in the paper "Topological gene-expression networks recapitulate brain anatomy and function".
    We present a pipeline based on the Mapper algorithm, a topological simplification tool, to produce and analyze gene co-expression data.
    The code contains an implementation of the Mapper algorithm with 2 filters, and code to compute the agreement edge density matrices for the optimal parameters.
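    For readers unfamiliar with Mapper, the general idea of using two filters can be sketched in a few lines of pure Python. This is not the implementation from the repository: the rectangular cover, the single-linkage clustering at a fixed distance `eps`, and all function and parameter names (`intervals`, `components`, `mapper_2d`, `n`, `overlap`) are assumptions made for illustration.

```python
from itertools import combinations, product
from math import dist


def intervals(lo, hi, n, overlap):
    """Cover [lo, hi] with n equal-length intervals.

    Consecutive intervals share a fraction `overlap` of their length.
    """
    length = (hi - lo) / (n - (n - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n)]


def components(indices, points, eps):
    """Single-linkage clusters: connected components of the eps-neighbor graph."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited if dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_2d(points, f1, f2, n=4, overlap=0.3, eps=1.0):
    """Return (nodes, edges) of a Mapper graph built from two filter functions.

    Each node is a frozenset of point indices; an edge (s, t) joins two
    nodes that share at least one point.
    """
    v1 = [f1(p) for p in points]
    v2 = [f2(p) for p in points]
    tol = 1e-9  # guards against floating-point error at interval ends
    nodes = []
    # The 2D cover is the product of the two 1D interval covers.
    for (a1, b1), (a2, b2) in product(
        intervals(min(v1), max(v1), n, overlap),
        intervals(min(v2), max(v2), n, overlap),
    ):
        patch = [
            i
            for i in range(len(points))
            if a1 - tol <= v1[i] <= b1 + tol and a2 - tol <= v2[i] <= b2 + tol
        ]
        # Cluster the points of each patch in the original space.
        nodes.extend(components(patch, points, eps))
    edges = {
        (s, t) for s, t in combinations(range(len(nodes)), 2) if nodes[s] & nodes[t]
    }
    return nodes, edges
```

    As a usage example, calling `mapper_2d(points, lambda p: p[0], lambda p: p[1])` on a list of 2D coordinate tuples uses the two coordinates themselves as filters. The sketch is quadratic in the number of points per patch and makes no attempt to select optimal parameters.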

pre-prints

publications

GIT: Navigating features: a topologically informed chart of electromyographic features space.

    The success of biological signal pattern recognition depends crucially on the selection of relevant features. Across signal and imaging modalities, a large number of features have been proposed, leading to feature redundancy and the need for optimal feature set identification. A further complication is that, due to the inherent biological variability, even the same classification problem on different datasets can display variations in the respective optimal sets, casting doubts on the generalizability of relevant features. Here, we approach this problem by leveraging topological tools to create charts of features spaces. These charts highlight feature sub-groups that encode similar information (and their respective similarities) allowing for a principled and interpretable choice of features for classification and analysis. Using multiple electromyographic (EMG) datasets as a case study, we use this feature chart to identify functional groups among 58 state-of-the-art EMG features, and to show that they generalize across three different forearm EMG datasets obtained from able-bodied subjects during hand and finger contractions. We find that these groups describe meaningful non-redundant information, succinctly recapitulating information about different regions of feature space. We then recommend representative features from each group based on maximum class separability, robustness and minimum complexity.

GIT: Topological analysis of data

    An editorial where we briefly present the TDA paradigm and some applications, in order to highlight its relevance to the data science community.

GIT: The shape of collaborations.

    A study on the structure of scientific collaborations using simplicial descriptions of publications. We extend the concept of triadic closure to simplicial complexes and introduce a new way of dealing with large simplex sizes when computing homology.

talks

teaching

GIT: Algebraic Topology - MATH 354

    The course is an introduction to basic Algebraic Topology concepts, such as homotopy, homology and cohomology. We will look into applications to Data Science at the end of the semester.

GIT: Data Science I - STAT 287

    Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand their world. Modeled on Berkeley’s Data 100 course, this course introduces basic concepts in data science.
    In this class, we explore key areas of data science including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision making. Through a strong emphasis on data-centric computing, quantitative critical thinking, and exploratory data analysis, this class covers key principles and techniques of data science. These include languages for transforming, querying and analyzing data; algorithms for machine learning methods including regression, classification and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.