eScholarship
Open Access Publications from the University of California

UC Davis

UC Davis Electronic Theses and Dissertations

A Graph-based Framework for Multiple Change-point Detection

No data is associated with this publication.
Abstract

We study multiple change-point detection in high-dimensional and non-Euclidean data using graph-based statistics. As data grow more complex and contain multiple change-points, traditional change-point detection methods designed for low-dimensional data are no longer suitable. We first propose a nonparametric multiple change-point detection framework based on graph-based statistics. The framework is a two-step procedure. In the first step, we combine generalized edge-count scan statistics with wild binary segmentation or seeded binary segmentation to search for a pool of candidate change-points. In the second step, we prune the candidate change-points using a novel goodness-of-fit statistic. Numerical studies show that the new framework outperforms existing methods under a wide range of settings. The resulting change-points can further be arranged hierarchically based on the goodness-of-fit statistic.
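
As a rough illustration of the first step only, the following Python sketch pairs an edge-count scan on a minimum spanning tree with wild binary segmentation. It rests on several assumptions that are not taken from the thesis: it uses the simpler original edge-count statistic (between-segment edges standardized under the permutation null) as a stand-in for the generalized statistic, the similarity graph is a Euclidean MST, the cutoff `threshold=4.0` is an arbitrary illustrative value, and all function and parameter names are hypothetical. The second-step pruning by the goodness-of-fit statistic is not shown.

```python
import math
import random


def mst_edges(points):
    """Euclidean minimum spanning tree via Prim's algorithm; returns (i, j) pairs."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def edge_count_scan(points, margin=3):
    """Scan all splits t; return (t, z) maximizing the standardized edge count."""
    n = len(points)  # assumed to be at least max(4, 2 * margin)
    edges = mst_edges(points)
    g = len(edges)
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    sum_deg_sq = sum(d * d for d in deg)
    best_t, best_z = None, -math.inf
    for t in range(margin, n - margin + 1):
        # R(t): edges joining an observation before t to one at or after t.
        r = sum((i < t) != (j < t) for i, j in edges)
        # Mean and variance of R(t) when the time order is randomly permuted.
        p1 = 2 * t * (n - t) / (n * (n - 1))
        p2 = (4 * t * (t - 1) * (n - t) * (n - t - 1)
              / (n * (n - 1) * (n - 2) * (n - 3)))
        mean = p1 * g
        var = p2 * g + (p1 / 2 - p2) * sum_deg_sq + (p2 - p1 * p1) * g * g
        if var <= 1e-12:
            continue
        z = (mean - r) / math.sqrt(var)  # few crossing edges -> large z
        if z > best_z:
            best_t, best_z = t, z
    return best_t, best_z


def wild_binary_segmentation(points, start, end, threshold, rng,
                             n_intervals=20, min_len=12):
    """Split [start, end) at the best-scoring point over random sub-intervals."""
    if end - start < min_len:
        return []
    intervals = [(start, end)]  # always include the full segment
    for _ in range(n_intervals):
        s = rng.randint(start, end - min_len)
        e = rng.randint(s + min_len, end)
        intervals.append((s, e))
    best_z, best_cp = -math.inf, None
    for s, e in intervals:
        t, z = edge_count_scan(points[s:e])
        if t is not None and z > best_z:
            best_z, best_cp = z, s + t
    if best_cp is None or best_z <= threshold:
        return []
    left = wild_binary_segmentation(points, start, best_cp, threshold, rng,
                                    n_intervals, min_len)
    right = wild_binary_segmentation(points, best_cp, end, threshold, rng,
                                     n_intervals, min_len)
    return left + [best_cp] + right


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic 5-dimensional sequence with mean shifts at indices 50 and 100.
    means = [0.0] * 50 + [6.0] * 50 + [0.0] * 50
    data = [[rng.gauss(mu, 1.0) for _ in range(5)] for mu in means]
    print(wild_binary_segmentation(data, 0, len(data), threshold=4.0, rng=rng))
```

A reported change-point is the index of the first observation in the new segment. The random sub-intervals matter when a segment holds several changes: an interval that isolates a single change gives a much cleaner signal than the full sequence does.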

Next, to further improve detection accuracy in scenarios with frequent changes and in scenarios with pure mean or pure covariance changes, we incorporate max-type edge-count scan statistics in the first step. In the second step, model selection uses a new goodness-of-fit statistic built on max-type two-sample test statistics, together with a stepwise algorithm.
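
To make the idea of a max-type statistic concrete, here is a minimal Python sketch of a max-type edge-count two-sample statistic on a given similarity graph. It is an assumption-laden illustration rather than the thesis's procedure: it follows the common formulation M = max(Z_w, |Z_diff|), where Z_w standardizes a weighted sum of the two within-group edge counts and Z_diff standardizes their difference, with means and variances taken under random permutation of the group labels. The function name, its arguments, and the choice to return all three values are hypothetical.

```python
import math


def max_type_statistic(edges, n, in_group1):
    """Return (M, Z_w, Z_diff) for a graph on n nodes and a boolean group label per node."""
    t = sum(in_group1)  # size of group 1
    m = n - t           # size of group 2
    g = len(edges)
    deg = [0] * n
    r1 = r2 = 0         # within-group edge counts
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
        if in_group1[i] and in_group1[j]:
            r1 += 1
        elif not in_group1[i] and not in_group1[j]:
            r2 += 1
    sum_deg_sq = sum(d * d for d in deg)

    # Weighted within-group count; it tends to be large under a location shift.
    r_w = ((m - 1) * r1 + (t - 1) * r2) / (n - 2)
    mean_w = g * (t - 1) * (m - 1) / ((n - 1) * (n - 2))
    var_w = (t * (t - 1) * m * (m - 1) / (n * (n - 1) * (n - 2) * (n - 3))
             * (g - sum_deg_sq / (n - 2) + 2 * g * g / ((n - 1) * (n - 2))))

    # Difference of within-group counts; it moves away from its mean under a scale change.
    r_diff = r1 - r2
    mean_diff = g * (t - m) / n
    var_diff = t * m / (n * (n - 1)) * (sum_deg_sq - 4 * g * g / n)

    def standardize(value, mean, var):
        # A zero variance (e.g. a regular graph for Z_diff) carries no information.
        return (value - mean) / math.sqrt(var) if var > 1e-12 else 0.0

    z_w = standardize(r_w, mean_w, var_w)
    z_diff = standardize(r_diff, mean_diff, var_diff)
    return max(z_w, abs(z_diff)), z_w, z_diff


if __name__ == "__main__":
    # Path graph 0-1-2-3-4-5 with the first three nodes in group 1.
    path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    print(max_type_statistic(path, 6, [True, True, True, False, False, False]))
```

Taking the maximum of the two standardized components lets a single statistic respond to a pure mean change (through Z_w) or a pure covariance change (through Z_diff), which is the motivation the abstract gives for the max-type variant.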

Furthermore, we consider an important application of multiple change-point detection to Neuropixels data. Neuropixels is a new tool in neuroscience that allows brain neuronal activity to be recorded at high resolution over long periods of time. The large size and non-stationarity of Neuropixels data make statistical analysis challenging. We propose a nonparametric method for detecting multiple change-points in this type of data. Change-point analysis can serve as a preliminary step for further statistical modeling. The proposed method combines max-type edge-count scan statistics with wild binary segmentation to search for change-points in parallel, greatly reducing the computation time required for long sequences. The method is demonstrated on Neuropixels data recorded from an awake mouse in nine brain regions for 20 minutes.

Main Content

This item is under embargo until August 1, 2024.