
The distribution of syntactic dependency distances

Abstract

The syntactic structure of a sentence can be represented as a graph, where vertices are words and edges indicate syntactic dependencies between them. In this setting, the distance between two linked words is defined as the difference between their positions. Here we aim to contribute to the characterization of the actual distribution of syntactic dependency distances, which has previously been argued to follow a power-law distribution. We propose a new model with two exponential regimes in which the probability decay is allowed to change after a break-point. This transition could mirror the transition from the processing of word chunks to higher-level structures. We find that a two-regime model, in which the first regime follows either an exponential or a power-law decay, is the most likely one in all 20 languages we considered, independently of sentence length and annotation style. Moreover, the break-point exhibits low variation across languages and averages 4-5 words, suggesting that the number of words that can be simultaneously processed abstracts from the specific language to a high degree. The probability decay slows down after the break-point, consistent with a universal chunk-and-pass mechanism. Finally, we give an account of the relation between the best estimated model and the closeness of syntactic dependencies as a function of sentence length, according to a recently introduced optimality score.
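The two ideas in this abstract, dependency distance and a decay rate that changes at a break-point, can be illustrated with a minimal sketch. It is not the authors' model: the truncated two-regime geometric form and the names `dependency_distances`, `two_regime_pmf`, `q1`, `q2`, `break_point` and `d_max` are assumptions chosen for illustration only.

```python
def dependency_distances(heads):
    """Return |position - head position| for every non-root word.

    `heads` uses 1-based positions; a head of 0 marks the root
    (an assumed, CoNLL-like convention).
    """
    return [abs(i - h) for i, h in enumerate(heads, start=1) if h != 0]


def two_regime_pmf(q1, q2, break_point, d_max):
    """Truncated two-regime geometric pmf over distances 1..d_max.

    Hypothetical parameterisation: the weight decays by a factor
    (1 - q1) per step up to `break_point` and by (1 - q2) afterwards.
    """
    weights = []
    for d in range(1, d_max + 1):
        if d <= break_point:
            w = (1 - q1) ** (d - 1)
        else:
            w = (1 - q1) ** (break_point - 1) * (1 - q2) ** (d - break_point)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


# "She ate the apple": She -> ate, ate = root, the -> apple, apple -> ate
distances = dependency_distances([2, 0, 4, 2])

# Faster decay before the break-point, slower decay after it (q2 < q1)
pmf = two_regime_pmf(q1=0.5, q2=0.2, break_point=4, d_max=10)
```

With `q2 < q1` the ratio between consecutive probabilities rises from `1 - q1` to `1 - q2` once the break-point is passed, which is the qualitative pattern the abstract reports. Fitting such a model to data would additionally require a likelihood-based estimate of the three parameters.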

Similar Papers
  • Research Article
  • Cite count: 3
  • 10.1080/09296174.2024.2400847
The Optimal Placement of the Head in the Noun Phrase. The Case of Demonstrative, Numeral, Adjective and Noun
  • Oct 20, 2024
  • Journal of Quantitative Linguistics
  • Ramon Ferrer-i-Cancho

The word order of a sentence is shaped by multiple principles. The principle of syntactic dependency distance minimization is in conflict with the principle of surprisal minimization (or predictability maximization) in single head syntactic dependency structures: while the former predicts that the head should be placed at the centre of the linear arrangement, the latter predicts that the head should be placed at one of the ends (either first or last). A critical question is when surprisal minimization (or predictability maximization) should surpass syntactic dependency distance minimization. In the context of single head structures, it has been predicted that this is more likely to happen when two conditions are met, i.e. (a) fewer words are involved and (b) words are shorter. Here, we test the prediction on the noun phrase when it is composed of demonstrative, numeral, adjective, and noun. We find that, across preferred orders in languages, the noun tends to be placed at one of the ends, confirming the theoretical prediction. We also show evidence of anti-locality effects: syntactic dependency distances in preferred orders are longer than expected by chance.
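The claim that distance minimization favours a central head can be checked with a small worked example. The sketch below assumes a single-head (star) structure in which every other word depends directly on the head; the function name `star_distance_sum` is hypothetical.

```python
def star_distance_sum(n, head_pos):
    """Sum of dependency distances in an n-word star whose head
    sits at 1-based position `head_pos`."""
    return sum(abs(i - head_pos) for i in range(1, n + 1))


# Four words, e.g. demonstrative, numeral, adjective and noun (the head)
sums = {p: star_distance_sum(4, p) for p in range(1, 5)}
```

For four words the sum is 4 when the head occupies one of the two central positions and 6 when it occupies either end, so a head placed at an end costs two extra units of distance in this toy setting.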

  • Research Article
  • Cite count: 73
  • 10.1609/aaai.v35i14.17478
GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction
  • May 18, 2021
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Wasi Uddin Ahmad + 2 more

Recent progress in cross-lingual relation and event extraction uses graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations such that models trained on one language can be applied to other languages. However, GCNs struggle to model words that have long-range dependencies or that are not directly connected in the dependency tree. To address these challenges, we propose to utilize the self-attention mechanism where we explicitly fuse structural information to learn the dependencies between words with different syntactic distances. We introduce GATE, a Graph Attention Transformer Encoder, and test its cross-lingual transferability on relation and event extraction tasks. We perform experiments on the ACE05 dataset that includes three typologically different languages: English, Chinese, and Arabic. The evaluation results show that GATE outperforms three recently proposed methods by a large margin. Our detailed analysis reveals that due to the reliance on syntactic dependencies, GATE produces robust representations that facilitate transfer across languages.

  • Research Article
  • Cite count: 8
  • 10.1063/1.4943542
The signature of initial production mechanisms in isotropic turbulence decay
  • Mar 1, 2016
  • Physics of Fluids
  • M Meldi

In the present work the quantification of the time-lasting effects of production mechanisms in homogeneous isotropic turbulence decay is addressed. The analysis is developed through the use of theoretical tools as well as numerical calculations based on the eddy damped quasinormal Markovian (EDQNM) model. In both cases a modified Lin equation is used, which accounts for production mechanisms as proposed by Meldi, Lejemble, and Sagaut [“On the emergence of non-classical decay regimes in multiscale/fractal generated isotropic turbulence,” J. Fluid Mech. 756, 816–843 (2014)]. The approaches used show that an exponential decay law can be observed if the intensity of the forcing is strong enough to drive the turbulence dynamics, before a power-law decay is eventually attained. The EDQNM numerical results indicate that the exponential regime can persist for long evolution times, longer than the observation time in grid turbulence experiments. A rigorous investigation of the self-similar behavior of the pressure spectrum has been performed by a comprehensive comparison of EDQNM data with direct numerical simulation (DNS)/experiments in the literature. While DNS and free decay EDQNM simulations suggest the need of a very high Reλ threshold in order to observe a clear −7/3 slope of the pressure inertial range, experimental data and forced EDQNM calculations indicate a significantly lower value. This observation suggests that the time-lasting effects of production mechanisms, which cannot be excluded in experiments, play a role in the lack of general agreement with classical numerical approaches. These results reinforce the urge to evolve the numerical simulation state of the art towards the prediction of realistic physical states.

  • Conference Article
  • Cite count: 3
  • 10.1109/o-cocosda202152914.2021.9660456
Using Local Phrase Dependency Structure Information in Neural Sequence-to-Sequence Speech Synthesis
  • Nov 18, 2021
  • Nobuyoshi Kaiki + 2 more

We introduce end-to-end text-to-speech synthesis (TTS) with prosodic symbols that represent phrase components based on local syntactic dependency structures for synthesizing Japanese speech with natural prosody. We propose two TTS models: 1) one with prosodic symbols representing the syntactic dependency distance at the phrase boundaries and 2) another with prosodic symbols that reflect a superimposed model of the phrase and accent components based on an F0 generation control mechanism. Using these two models, we observed 1) pause insertion that indicates the phrase boundary and 2) F0 resetting at the right-branching boundaries. To verify the effectiveness of these two proposed models against the conventional model using only accent components, we conducted an AB test as a subjective evaluation. Our result confirmed that synthetic speech with natural prosody, which reflects the corresponding intention to the utterance, was generated using the local phrase dependency information of sentences and the F0 generation model in a Japanese end-to-end TTS.

  • Research Article
  • Cite count: 71
  • 10.1016/j.langsci.2016.09.006
The effects of genre on dependency distance and dependency direction
  • Nov 3, 2016
  • Language Sciences
  • Yaqin Wang + 1 more


  • Research Article
  • Cite count: 23
  • 10.1007/pl00011114
Power, Lévy, exponential and Gaussian-like regimes in autocatalytic financial systems
  • Apr 1, 2001
  • The European Physical Journal B
  • Z.F Huang + 1 more

We study by theoretical analysis and by direct numerical simulation the dynamics of a wide class of asynchronous stochastic systems composed of many autocatalytic degrees of freedom. We describe the generic emergence of truncated power laws in the size distribution of their individual elements. The exponents $\alpha$ of these power laws are time independent and depend only on the way the elements with very small values are treated. These truncated power laws determine the collective time evolution of the system. In particular, the global stochastic fluctuations of the system differ from the normal Gaussian noise according to the time and size scales at which these fluctuations are considered. We describe the ranges in which these fluctuations are parameterized respectively by: the Lévy regime ($\alpha < 2$), the power-law decay with large exponent ($\alpha > 2$), and the exponential decay. Finally, we relate these results to the large-exponent power laws found in the actual behavior of the stock markets and to the exponential cut-off detected in certain recent measurements.

  • Research Article
  • Cite count: 40
  • 10.1103/physreve.105.014308
Optimality of syntactic dependency distances.
  • Jan 18, 2022
  • Physical Review E
  • Ramon Ferrer-i-Cancho + 3 more

It is often stated that human languages, like other biological systems, are shaped by cost-cutting pressures, but to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies, and the space is defined by the linear order of the words in the sentence. We introduce a score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of the languages are optimized to 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions: that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a hierarchical ranking of languages by their degree of optimization. The score has implications for various fields of language research (dependency linguistics, typology, historical linguistics, clinical linguistics, and cognitive science). Finally, the principles behind the design of the score have implications for network science.
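The abstract does not define the score, so the sketch below illustrates only the chance baseline that phrases such as "longer than expected by chance" rely on: the mean sum of dependency distances when word order is shuffled uniformly at random. The function names are hypothetical, and the brute-force enumeration is practical for very short sentences only.

```python
from fractions import Fraction
from itertools import permutations


def distance_sum(edges, position):
    """Sum of |position[u] - position[v]| over the dependency edges."""
    return sum(abs(position[u] - position[v]) for u, v in edges)


def mean_random_distance_sum(edges, n):
    """Exact mean of the distance sum over all n! word orders."""
    total = 0
    count = 0
    for perm in permutations(range(n)):
        # perm[v] is the position assigned to word v in this ordering
        total += distance_sum(edges, perm)
        count += 1
    return Fraction(total, count)


# A 4-word chain: 0 - 1 - 2 - 3
baseline = mean_random_distance_sum([(0, 1), (1, 2), (2, 3)], 4)
```

For a tree on n words the enumeration agrees with the closed form (n² − 1)/3, because each of the n − 1 edges has an expected length of (n + 1)/3 under a random ordering. An observed distance sum can then be compared against this baseline.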

  • Research Article
  • Cite count: 104
  • 10.1006/csla.2000.0149
Maximum entropy techniques for exploiting syntactic, semantic and collocational dependencies in language modeling
  • Oct 1, 2000
  • Computer Speech & Language
  • Sanjeev Khudanpur + 1 more


  • Research Article
  • Cite count: 17
  • 10.1088/1742-5468/aaa79a
Quantum return probability of a system of N non-interacting lattice fermions
  • Feb 1, 2018
  • Journal of Statistical Mechanics: Theory and Experiment
  • P L Krapivsky + 2 more

We consider N non-interacting fermions performing continuous-time quantum walks on a one-dimensional lattice. The system is launched from a most compact configuration where the fermions occupy neighboring sites. We calculate exactly the quantum return probability (sometimes referred to as the Loschmidt echo) of observing the very same compact state at a later time t. Remarkably, this probability depends on the parity of the fermion number: it decays as a power of time for even N, while for odd N it exhibits periodic oscillations modulated by a decaying power law. The exponent also slightly depends on the parity of N, and is roughly half of what it would be in the continuum limit. We also consider the same problem, and obtain similar results, in the presence of an impenetrable wall at the origin constraining the particles to remain on the positive half-line. We derive closed-form expressions for the amplitudes of the power-law decay of the return probability in all cases. The key point in the derivation is the use of Mehta integrals, which are limiting cases of the Selberg integral.

  • Research Article
  • Cite count: 7
  • 10.1177/1461445619866985
A computational model for measuring discourse complexity
  • Aug 2, 2019
  • Discourse Studies
  • Kun Sun + 1 more

In past studies, the few quantitative approaches to discourse structure were mostly confined to the presentation of the frequency of discourse relations. However, quantitative approaches should take into account both hierarchical and relational layers in the discourse structure. This study considers these factors and addresses the issue of how discourse relations and discourse units are related. It draws upon the available corpora of discourse structure (rhetorical structure theory-discourse treebank (RST-DT)) from a new perspective. Since an RST tree can be converted into a syntactic dependency tree, the data extracted from the RST-DT can be useful for calculating the discourse distance in much the same way as syntactic dependency distance is calculated. Discourse distance is also applicable to measuring the depth of the human processing of discourse. Furthermore, the data derived from the RST-DT are also easily converted into network data. This study finds that discourse structure has its discourse distance minimum and each type of RST relations has its range of discourse distance. The frequency distribution of discourse data basically follows the power law on several levels, while a network approach reveals how discourse units are arranged spatially in regular patterns. The two methods are mutually complementary in revealing the interaction between discourse relations and discourse units in a comprehensive manner, as well as in revealing how people process and comprehend discourse dynamically. Accordingly, we propose merging the two methods so as to yield a computational model for assessing discourse complexity and comprehension.

  • Research Article
  • Cite count: 26
  • 10.3390/s24020418
Modeling Structured Dependency Tree with Graph Convolutional Networks for Aspect-Level Sentiment Classification
  • Jan 10, 2024
  • Sensors (Basel, Switzerland)
  • Qin Zhao + 3 more

Aspect-based sentiment analysis is a fine-grained task where the key goal is to predict sentiment polarities of one or more aspects in a given sentence. Currently, graph neural network models built upon dependency trees are widely employed for aspect-based sentiment analysis tasks. However, most existing models still contain a large number of noisy nodes and therefore cannot precisely capture the contextual relationships between specific aspects. Meanwhile, most studies do not consider the connections between nodes that lack direct dependency edges but play critical roles in determining the sentiment polarity of an aspect. To address the aforementioned limitations, we propose a Structured Dependency Tree-based Graph Convolutional Network (SDTGCN) model. Specifically, we explore construction of a structured syntactic dependency graph by incorporating positional information, sentiment commonsense knowledge, part-of-speech tags, syntactic dependency distances, etc., to assign arbitrary edge weights between nodes. This enhances the connections between aspect nodes and pivotal words while weakening irrelevant node links, enabling the model to sufficiently express sentiment dependencies between specific aspects and contextual information. We utilize part-of-speech tags and dependency distances to discover relationships between pivotal nodes without direct dependencies. Finally, we aggregate node information by fully considering their importance to obtain precise aspect representations. Experimental results on five publicly available datasets demonstrate the superiority of our proposed model over state-of-the-art approaches; furthermore, the accuracy and F1-score show a significant improvement on the majority of datasets, with increases of 0.74, 0.37, 0.65, and 0.79, 0.75, 1.17, respectively. This series of enhancements highlights the effective progress made by the SDTGCN model in enhancing sentiment classification performance.

  • Research Article
  • Cite count: 6
  • 10.1021/jp076753q
Dynamics of Barrierless and Activated Chemical Reactions in a Dispersive Medium within the Fractional Diffusion Equation Approach
  • Jan 8, 2008
  • The Journal of Physical Chemistry B
  • K Seki + 2 more

Barrierless chemical reactions have often been modeled as a Brownian motion on a one-dimensional harmonic potential energy surface with a position-dependent reaction sink or window located near the minimum of the surface. This simple (but highly successful) description leads to a nonexponential survival probability only at small to intermediate times but exponential decay in the long-time limit. However, in several reactive events involving proteins and glasses, the reactions are found to exhibit a strongly nonexponential (power law) decay kinetics even in the long time. In order to address such reactions, here, we introduce a model of barrierless chemical reaction where the motion along the reaction coordinate sustains dispersive diffusion. A complete analytical solution of the model can be obtained only in the frequency domain, but an asymptotic solution is obtained in the limit of long time. In this case, the asymptotic long-time decay of the survival probability is a power law of the Mittag-Leffler functional form. When the barrier height is increased, the decay of the survival probability still remains nonexponential, in contrast to the ordinary Brownian motion case where the rate is given by the Smoluchowski limit of the well-known Kramers' expression. Interestingly, the reaction under dispersive diffusion is shown to exhibit strong dependence on the initial state of the system, thus predicting a strong dependence on the excitation wavelength for photoisomerization reactions in a dispersive medium. The theory also predicts a fractional viscosity dependence of the rate, which is often observed in the reactions occurring in complex environments.

  • Research Article
  • Cite count: 60
  • 10.1103/physrevlett.122.130401
Experimental Investigation of Quantum Decay at Short, Intermediate, and Long Times via Integrated Photonics.
  • Apr 3, 2019
  • Physical Review Letters
  • Andrea Crespi + 7 more

The decay of an unstable system is usually described by an exponential law. Quantum mechanics predicts strong deviations of the survival probability from the exponential: Indeed, the decay is initially quadratic, while at very large times it follows a power law, with superimposed oscillations. The latter regime is particularly elusive and difficult to observe. Here we employ arrays of single-mode optical waveguides, fabricated by femtosecond laser direct inscription, to implement quantum systems where a discrete state is coupled and can decay into a continuum. The optical modes correspond to distinct quantum states of the photon, and the temporal evolution of the quantum system is mapped into the spatial propagation coordinate. By injecting coherent light states in the fabricated photonic structures and by measuring a small scattered fraction of such light with an unprecedented dynamic range, we are able to experimentally observe not only the exponential decay regime, but also the quadratic Zeno region and the power-law decay at long evolution times.

  • Research Article
  • Cite count: 26
  • 10.1016/j.ins.2020.03.022
Natural language modeling with syntactic structure dependency
  • Apr 1, 2020
  • Information Sciences
  • Kai Shuang + 3 more


  • Research Article
  • Cite count: 9
  • 10.1088/1751-8121/ab3305
Running measurement protocol for the quantum first-detection problem
  • Aug 2, 2019
  • Journal of Physics A: Mathematical and Theoretical
  • Dror Meidan + 2 more

The problem of the detection statistics of a quantum walker has received increasing interest. We investigate the effect of employing a moving detector, using a projective measurement approach with fixed sampling time τ, with the detector moving right before every detection attempt. For a tight-binding quantum walk on the line, the moving detector allows one to target a specific range of group velocities of the walker, qualitatively modifying the behavior of the quantum first-detection probabilities. We map the problem to that of a stationary detector with a modified unitary evolution operator and use established methods for the solution of that problem to study the first-detection statistics for a moving detector on a finite ring and on an infinite 1D lattice. On the line, the system exhibits a dynamical phase transition at a critical value of τ, from a state where the probability of detection decreases exponentially in time and the total detection probability is very small, to a state with power-law decay and a significantly higher total probability to detect the particle. The exponent describing the power-law decay of the detection probability at this critical τ is 10/3, as opposed to 3 for every larger τ. In addition, the moving detector strongly modifies the Zeno effect.
