High-frequency market manipulation detection with a Markov-modulated Hawkes process

Abstract

This work focuses on a self-exciting point process defined by a Hawkes-like intensity and a switching mechanism based on a hidden Markov chain. Previous works in such a setting assume constant intensities between consecutive events. We extend the model to general Hawkes excitation kernels that are piecewise constant between events. We develop an expectation-maximization algorithm for the statistical inference of the Hawkes intensity parameters as well as the state transition probabilities. The numerical convergence of the estimators is extensively tested on simulated data. Using high-frequency cryptocurrency data from a top centralized exchange, we apply the model to the detection of anomalous bursts of trades. We benchmark the goodness-of-fit of the model against the Markov-modulated Poisson process and demonstrate the relevance of the model in detecting suspicious activities.
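As a rough illustration of the kind of intensity the abstract describes, the sketch below computes a regime-dependent Hawkes intensity that is held constant on each inter-event interval. Everything specific here is an assumption rather than the paper's specification: the exponential kernel, the parameter names `mu`, `alpha`, and `beta`, and the convention that the hidden state is given after each event. The abstract does not state these details.

```python
import math

def piecewise_constant_intensity(event_times, states, mu, alpha, beta):
    """Return the intensity held on each interval (t_n, t_{n+1}].

    event_times: increasing event times t_0 < t_1 < ...
    states:      hidden regime index in force after each event
    mu, alpha:   per-regime baseline and excitation scale (hypothetical)
    beta:        decay rate of the assumed exponential kernel
    """
    intensities = []
    excitation = 0.0  # running sum of exp(-beta * (t_n - t_i)) for i <= n
    prev_t = None
    for t, s in zip(event_times, states):
        if prev_t is not None:
            # Decay the accumulated excitation over the elapsed gap.
            excitation *= math.exp(-beta * (t - prev_t))
        excitation += 1.0  # contribution of the event that just occurred
        # The kernel is evaluated only at event times, so the value is
        # frozen until the next event arrives.
        intensities.append(mu[s] + alpha[s] * excitation)
        prev_t = t
    return intensities

levels = piecewise_constant_intensity(
    event_times=[0.0, 1.0], states=[0, 1],
    mu=[0.5, 2.0], alpha=[0.1, 1.0], beta=math.log(2.0),
)
```

A burst of closely spaced events keeps `excitation` high, so a regime with a large `alpha` produces the sustained elevated intensity that a detection rule could flag.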

Similar Papers
  • Research Article
  • Citations: 62
  • 10.1016/0165-1684(94)00088-h
State duration modelling in hidden Markov models
  • Jan 1, 1995
  • Signal Processing
  • S.V Vaseghi


  • Book Chapter
  • Citations: 7
  • 10.5772/14993
Continuous Hidden Markov Models for Depth Map-Based Human Activity Recognition
  • Apr 19, 2011
  • Zia Uddin + 1 more

There is an enormous volume of literature on the applications of Hidden Markov Models (HMMs) to a broad range of pattern recognition tasks. The first practical application of HMMs is largely based on the work of Rabiner et al (Lawrence & Rabiner, 1989) for speech recognition. Since then, HMMs have been extensively used in various scientific fields such as computational biology, biomedical signal interpretation, image classification and segmentation, etc. An HMM can be described as a stochastic finite-state automaton that can be used to model time sequential data. In general, there are four basic parts involved in the HMM: namely states, initial state distribution, state transition matrix, and state observation matrix. A state represents a property or condition that an HMM might have at a particular time. The initial state distribution gives the probability of each state at the time the modeling procedure of an event starts. The state transition matrix contains the transition probabilities between the states. The observation matrix contains the observation probabilities from each state. Once the architecture of an HMM is defined with the four essential components, training of the HMM is required. To train, the first step is to classify features into a specific number of clusters, generating a codebook. Then from the codebook, symbol sequences are generated through vector quantization. These symbol sequences are later used to model spatiotemporal patterns in an HMM. The number of states and the initial state distribution of the HMM are in general determined empirically. The state transition and observation probabilities from each state are usually initialized with uniform distributions and later adapted according to the training symbol sequences. In practice, there are some well-established training algorithms available to automatically optimize the parameters of the HMM.
The Baum–Welch (Baum et al., 1970) training procedure is a standard algorithm which uses the Maximum Likelihood Estimation (MLE) criterion. In this training algorithm, the training symbol sequences are used to estimate the HMM parameters. Finally, a testing sequence is analyzed by the trained HMMs to be recognized. In an HMM, the underlying processes are usually not observable, but they can be observed through another set of stochastic processes that produces discrete or continuous observations (Lawrence & Rabiner, 1989), which lead to discrete or continuous HMMs respectively. In the discrete HMMs, the observation sequences are vector-quantized using a codebook to select discrete symbols. Though the discrete symbols for the observations

  • Book Chapter
  • Citations: 9
  • 10.1093/oso/9780198526155.003.0047
The Markov Modulated Poisson Process and Markov Poisson Cascade with Applications to Web Traffic Modeling
  • Jul 3, 2003
  • Steven L Scott + 1 more

A Markov modulated Poisson Process (MMPP) is a Poisson process whose rate varies according to a Markov process. The nonhomogeneous MMPP developed in this article is a natural model for point processes whose events combine irregular bursts of activity with predictable (e.g. daily and hourly) patterns. We show how the MMPP may be viewed as a superposition of unobserved Poisson processes that are activated and deactivated by an unobserved Markov process. The MMPP is a continuous time model which may also be viewed as a discretely indexed nonstationary hidden Markov model by viewing intervals between events as a sequence of dependent random variables. The HMM representation allows one to probabilistically reconstruct the latent Markov and Poisson processes using a set of forward-backward recursions. The recursions allow MMPP parameters to be estimated either by an EM algorithm or by a rapidly mixing Markov chain Monte Carlo algorithm which uses the recursions for data augmentation. The Markov-Poisson cascade (MPC) is an MMPP whose underlying Markov process obeys certain restrictions which uniquely order the event rates for the observed process. The ordering avoids a possible label switching issue without slowing down the rapidly mixing algorithms we use to implement the model. We apply the MPC to a data set containing click rate data for individual computer users browsing through the World Wide Web. Because the complete data posterior distribution for the MPC is a product of exponential family distributions we are able to incorporate data from multiple users into a hierarchical model using existing methods from hierarchical Poisson regression.
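The definition above (a Poisson process whose rate is driven by an unobserved Markov process) can be made concrete with a small simulation. This is a minimal sketch of a homogeneous two-state MMPP, not the nonhomogeneous model of the chapter; the function name `simulate_mmpp` and all parameter values are hypothetical.

```python
import random

def simulate_mmpp(rates, switch_rates, horizon, seed=0):
    """Simulate a two-state MMPP on [0, horizon).

    rates[s]:        Poisson event rate while the hidden chain is in state s
    switch_rates[s]: rate at which the hidden chain leaves state s
    Returns (event_times, state_path), where state_path lists
    (time, state) pairs marking each change of the hidden state.
    """
    rng = random.Random(seed)
    t, state = 0.0, 0
    events, path = [], [(0.0, 0)]
    while True:
        # Events and state switches compete as independent exponential
        # clocks, so the next occurrence of either has the summed rate.
        total = rates[state] + switch_rates[state]
        t += rng.expovariate(total)
        if t >= horizon:
            break
        if rng.random() < rates[state] / total:
            events.append(t)          # an observed arrival
        else:
            state = 1 - state         # an unobserved regime switch
            path.append((t, state))
    return events, path

events, path = simulate_mmpp(rates=[1.0, 20.0], switch_rates=[0.5, 2.0], horizon=50.0)
```

With these illustrative values, state 1 is a short-lived high-rate regime, so the simulated arrivals show quiet stretches interrupted by bursts.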

  • Research Article
  • Citations: 1
  • 10.1007/s00477-007-0209-z
UXO target area identification with hidden Markov models
  • Jan 16, 2008
  • Stochastic Environmental Research and Risk Assessment
  • Sean A Mckenna

Site characterization activities at potential unexploded ordnance (UXO) sites rely on sparse sampling collected as geophysical surveys along strip transects. From these samples, the locations of target areas, those regions on the site where the geophysical anomaly density is significantly above the background density, must be identified. A target area detection approach using a hidden Markov model (HMM) is developed here. HMMs use stationary transition probabilities from one state to another for steps between adjacent locations as well as the probability of any particular observation occurring given each possible underlying state. The approach developed here identifies the transition probabilities directly from the conceptual site model (CSM) created as part of the UXO site characterization process. A series of simulations examine the ability of the HMM approach to simultaneously determine the target area locations within each transect and to estimate the unknown anomaly intensity within the identified target area. The HMM results are compared to those obtained using a simpler target detection approach that considers the background anomaly density to be defined by a Poisson distribution and each location to be independent of any adjacent location. Results show that the HMM approach is capable of accurately identifying the target locations with limited false positive identifications when both the background and target area intensities are known. The HMM approach is relatively robust to changes in the initial estimate of the target anomaly intensity and is capable of identifying target locations and the corresponding target anomaly intensity when this intensity is approximately 60% higher than the background intensity at intensities that are representative of actual field sites. Application to data collected from a wide area assessment field site shows that the HMM approach identifies the area of the site with elevated anomaly intensity with few false positives.
This field site application also shows that the HMM results are relatively robust to changes in the transect width.

  • Research Article
  • Citations: 14
  • 10.1016/j.entcs.2015.10.022
Adapting Hidden Markov Models for Online Learning
  • Nov 1, 2015
  • Electronic Notes in Theoretical Computer Science
  • Tiberiu Chis + 1 more


  • Research Article
  • Citations: 24
  • 10.1007/s10651-005-0006-0
Markov Modulated Poisson Processes for Clustered Line Transect Data
  • Jun 1, 2006
  • Environmental and Ecological Statistics
  • Hans J Skaug

We model the points of the detection along the transect line by a Markov modulated Poisson process (MMPP). The MMPP can accommodate the spatial cluster structure typical of many line transect surveys. The basic idea is that animal density switches between a low and a high level according to a latent Markov process. The MMPP is attractive from a mathematical point of view, as it provides an explicit expression for the likelihood function and other important quantities. We focus on estimating the level of overdispersion in the number of detected animals, as this is important for quantifying the precision of the line transect estimator of animal abundance. The approach is illustrated using both simulated data and data from a minke whale sighting survey conducted in the North Atlantic.

  • Research Article
  • 10.4028/www.scientific.net/amr.588-589.843
Recognition and Diagnosis of the Incipient Faults in Analog Circuit Using Improved HMM
  • Nov 1, 2012
  • Advanced Materials Research
  • Ji Jun Zhang + 2 more

Due to the uncertainties that exist in the running of analog circuits, the traditional hidden Markov model (HMM) approach is improved by replacing the state transition probability (STP) matrix of the traditional model with a time-varying one. An updating control factor is introduced to avoid excessive updating of the STP in the initial stage of each state. The experimental results indicate that the improved HMM has better fault recognition and diagnosis capability.

  • Book Chapter
  • Citations: 1
  • 10.1002/9780470400531.eorms0493
Markov and Hidden Markov Models
  • Jan 1, 2011
  • Refik Soyer

In this article we give an overview of Markov models that are used for describing dynamics of operating environments in reliability analysis. We present both continuous‐ and discrete‐time Markov chains as well as diffusion processes. We also consider Markov modulated stochastic processes that are known as hidden Markov models and discuss their properties. More specifically, we give an overview of Markov modulated Poisson and Markov modulated Bernoulli processes.

  • Research Article
  • Citations: 205
  • 10.1016/j.trc.2014.02.007
A Hidden Markov Model for short term prediction of traffic conditions on freeways
  • Apr 21, 2014
  • Transportation Research Part C: Emerging Technologies
  • Yan Qi + 1 more


  • Research Article
  • Citations: 64
  • 10.1007/s10548-019-00745-5
Unpacking Transient Event Dynamics in Electrophysiological Power Spectra
  • Jan 1, 2019
  • Brain Topography
  • Andrew J Quinn + 9 more

Electrophysiological recordings of neuronal activity show spontaneous and task-dependent changes in their frequency-domain power spectra. These changes are conventionally interpreted as modulations in the amplitude of underlying oscillations. However, this overlooks the possibility of underlying transient spectral ‘bursts’ or events whose dynamics can map to changes in trial-average spectral power in numerous ways. Under this emerging perspective, a key challenge is to perform burst detection, i.e. to characterise single-trial transient spectral events, in a principled manner. Here, we describe how transient spectral events can be operationalised and estimated using Hidden Markov Models (HMMs). The HMM overcomes a number of the limitations of the standard amplitude-thresholding approach to burst detection; in that it is able to concurrently detect different types of bursts, each with distinct spectral content, without the need to predefine frequency bands of interest, and does so with less dependence on a priori threshold specification. We describe how the HMM can be used for burst detection and illustrate its benefits on simulated data. Finally, we apply this method to empirical data to detect multiple burst types in a task-MEG dataset, and illustrate how we can compute burst metrics, such as the task-evoked timecourse of burst duration.

  • Dissertation
  • 10.53846/goediss-3007
On some special-purpose hidden Markov models
  • Feb 20, 2022
  • Roland Langrock

Hidden Markov models (HMMs) provide flexible devices for modelling time series of observations that depend on underlying serially correlated states. They constitute a specific class of dependent mixtures that have proved useful in many application fields. This thesis exploits the flexible mathematical structure of HMMs to develop three types of special-purpose HMMs, i.e. HMMs that differ from the standard setting and that are designed to address special demands. The first main part of the thesis considers HMMs whose matrix of state transition probabilities is structured such that it allows for arbitrary state dwell-time distributions while preserving the Markov property of the latent process. Such HMMs represent a convenient tool for approximating more flexible, but also more complicated hidden semi Markov models (HSMMs). They require fewer assumptions and enable the fitting of stationary HSMMs. Several applications illustrate the feasibility of the proposed method. In the second main part of the thesis it is shown that general-type state-space models (SSMs) can be approximated arbitrarily accurately by suitably defined HMMs. The proposed approximation method, based on HMMs, has the important advantage that it is easy to implement. Unlike the case of SSMs, where the likelihood is given by a multiple integral which cannot be evaluated directly, the likelihood of the proposed model is easy to compute; numerical maximization thus is feasible. That makes it possible to experiment with variations of models with relatively little programming effort. This is illustrated by a substantial investigation of several new variations of the well-known stochastic volatility model that were applied to series of daily returns. With reference to the recent financial crisis it is shown that a moderate increase in the flexibility, particularly of the log-volatility process, appears to enhance the model's ability to cope with extreme fluctuations of returns. 
Several other applications illustrate the ease with which the method can be applied to several types of SSMs. The final part of the thesis considers the modelling of sleep EEG signals via HMMs. The proposed method is applied to populations of sleep EEG time series related to well-matched subjects with and without sleep disordered breathing. The analysis confirms results from studies on sleep stage time series obtained by labour-intensive visual classification.

  • Research Article
  • Citations: 1
  • 10.3724/sp.j.1087.2009.00392
Study on audio classification based on 1-state HMM
  • Apr 7, 2009
  • Journal of Computer Applications
  • Ji-Ming Zheng + 2 more

The hidden Markov model (HMM), a statistical signal model, plays an important role in content-based audio retrieval systems. Because audio classification is concerned more with the type of audio than with its content, a 1-state HMM was used for audio classification, which avoids the need to assume initial state probabilities and state transition probabilities when initializing a multi-state HMM. The experiment shows that the audio classification method based on a 1-state HMM could effectively decrease misrecognition and increase the accuracy of audio classification.

  • Research Article
  • Citations: 99
  • 10.1111/j.1365-246x.2007.03559.x
Identifying volcanic regimes using Hidden Markov Models
  • Nov 1, 2007
  • Geophysical Journal International
  • Mark S Bebbington

We examine the application of Hidden Markov Models (HMMs) to volcanic occurrences. The parameters in HMMs can be estimated from data by means of the Expectation-Maximization (EM) algorithm. Various formulations permit modelling the activity level of a volcano through onset counts, the intensity of a Markov Modulated Poisson Process (MMPP), or through the intervals between onsets. More elaborate models allow investigation of the relationship between durations and reposes. After fitting the model, the Viterbi algorithm can be used to identify the underlying (hidden) activity level of the volcano most consistent with the observations. The HMM readily provides forecasts of the next event, and is easily simulated. Data on flank eruptions 1600–2006 from Mount Etna are used to illustrate the methodology. We find that the volcano has longish periods of Poissonian behaviour, interspersed with less random periods, and that changes in regime may be more frequent than have previously been identified statistically. The flank eruptions of Mount Etna appear to have a complex time-predictable character, which is compatible with transitions between an open and closed conduit system. The relationship between reposes and durations appears to characterize the cyclic nature of the volcano's activity.
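The decoding step mentioned in this abstract, recovering the hidden state sequence most consistent with the observations, can be sketched for a generic discrete-observation HMM as follows. This is a textbook log-space Viterbi recursion, not the authors' implementation; the toy parameters in the usage lines are hypothetical.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden state path for a discrete-observation HMM.

    obs:     sequence of observation indices
    start_p: start_p[s] is the initial probability of state s
    trans_p: trans_p[r][s] is the probability of moving from r to s
    emit_p:  emit_p[s][o] is the probability of observing o in state s
    """
    def log(x):
        # Work in log space; zero probabilities map to -infinity.
        return math.log(x) if x > 0 else float("-inf")

    n_states = len(start_p)
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[k][s] is the best predecessor of state s at step k + 1
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans_p[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

decoded = viterbi(
    obs=[0, 1, 1, 0],
    start_p=[0.5, 0.5],
    trans_p=[[0.5, 0.5], [0.5, 0.5]],
    emit_p=[[1.0, 0.0], [0.0, 1.0]],
)
```

In the toy example each state emits a single symbol with certainty, so the decoded path simply mirrors the observations; with overlapping emission distributions the transition matrix would smooth the path toward persistent regimes.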

  • Research Article
  • Citations: 28
  • 10.1371/journal.pone.0114089
Using Hidden Markov Models to Improve Quantifying Physical Activity in Accelerometer Data – A Simulation Study
  • Dec 2, 2014
  • PLoS ONE
  • Vitali Witowski + 4 more

Introduction: The use of accelerometers to objectively measure physical activity (PA) has become the preferred method in recent years. Traditionally, cutpoints are used to assign impulse counts recorded by the devices to sedentary and activity ranges. Here, hidden Markov models (HMM) are used to improve the cutpoint method to achieve a more accurate identification of the sequence of modes of PA. Methods: 1,000 days of labeled accelerometer data have been simulated. For the simulated data the actual sedentary behavior and activity range of each count is known. The cutpoint method is compared with HMMs based on the Poisson distribution (HMM[Pois]), the generalized Poisson distribution (HMM[GenPois]) and the Gaussian distribution (HMM[Gauss]) with regard to misclassification rate (MCR), bout detection, detection of the number of activities performed during the day and runtime. Results: The cutpoint method had a misclassification rate (MCR) of 11% followed by HMM[Pois] with 8%, HMM[GenPois] with 3% and HMM[Gauss] having the best MCR with less than 2%. HMM[Gauss] detected the correct number of bouts in 12.8% of the days, HMM[GenPois] in 16.1%, HMM[Pois] and the cutpoint method in none. HMM[GenPois] identified the correct number of activities in 61.3% of the days, whereas HMM[Gauss] only in 26.8%. HMM[Pois] did not identify the correct number at all and seemed to overestimate the number of activities. Runtime varied between 0.01 seconds (cutpoint), 2.0 minutes (HMM[Gauss]) and 14.2 minutes (HMM[GenPois]). Conclusions: Using simulated data, HMM-based methods were superior in activity classification when compared to the traditional cutpoint method and seem to be appropriate to model accelerometer data. Of the HMM-based methods, HMM[Gauss] seemed to be the most appropriate choice to assess real-life accelerometer data.

  • Research Article
  • Citations: 30
  • 10.15866/irecos.v11i4.8700
Hidden Markov Model for Process Mining of Parallel Business Processes
  • Apr 30, 2016
  • International Review on Computers and Software (IRECOS)
  • Riyanarto Sarno + 1 more

One task in process mining is process discovery, which produces a representation of a parallel business process. This representation is called a process model, and it consists of sequence and parallel control-flow patterns. The parallel control-flow patterns contain XOR, AND, and OR relations. The Hidden Markov Model is rarely used to represent a process model since XOR, AND and OR relations are not visible. In a Hidden Markov Model, the control-flow patterns are represented by probabilities of state transitions. This research proposes a process discovery algorithm based on the Hidden Markov Model. The algorithm contains equations and rules: the equations are used to differentiate XOR, AND, and OR relations, while the rules are used to establish the process model from the detected control-flow patterns. The experimental results show that the proposed algorithm obtains the right control-flow patterns in the process model. The paper demonstrates that the fitness of process models obtained by the proposed algorithm is higher than that of models obtained by the Heuristics Miner and Time-based Heuristics Miner algorithms. It also shows that the validity of process models obtained by the proposed algorithm is better than that of models obtained by other algorithms.
