Lifting Dichotomies
Abstract: Lifting theorems are used to transfer lower bounds between Boolean function complexity measures. Given a lower bound on a complexity measure $A$ for some function $f$, we compose $f$ with a carefully chosen gadget function $g$ and get essentially the same lower bound on a complexity measure $B$ for the lifted function $f \diamond g$. Lifting theorems have applications in many different areas, such as circuit complexity, communication complexity, and proof complexity. One of the main questions in the context of lifting is how to choose a suitable gadget $g$. Generally, to get better results, i.e., to minimize the losses when transferring lower bounds, we need the gadget to be of constant size (number of inputs). Unfortunately, in many settings we know lifting results only for gadgets whose size grows with the size of $f$, and it is unclear whether they can be improved to constant-size gadgets. This motivates us to identify the properties of gadgets that make lifting possible. In this paper, we systematically study the question 'For which gadgets does the lifting result hold?' in the following four settings: lifting from decision tree depth to decision tree size, lifting from conjunction DAG width to conjunction DAG size, lifting from decision tree depth to parity decision tree depth and size, and lifting from block sensitivity to deterministic and randomized communication complexities. In all of these cases, we prove a complete classification of gadgets by exposing the properties of gadgets that make the lifting results hold. The structure of the results shows that there are no intermediate cases: for every gadget, there is either a polynomial lifting or no lifting at all. As a byproduct of our studies, we prove the log-rank conjecture for the class of functions that can be represented as $f \diamond \mathrm{OR} \diamond \mathrm{XOR}$ for some function $f$.
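The block composition at the heart of lifting, $(f \diamond g)(x_1,\dots,x_n) = f(g(x_1),\dots,g(x_n))$, can be sketched directly; the XOR gadget and OR outer function below are illustrative placeholders, not the specific gadgets the paper classifies.

```python
# Minimal sketch of the lifted function f ◇ g: each input block is fed
# through the gadget g, and f is applied to the resulting bits.
# XOR gadget and OR outer function are illustrative choices only.

def lift(f, g, block_size):
    """Return the lifted function (f ◇ g) on n * block_size bits."""
    def lifted(bits):
        assert len(bits) % block_size == 0
        blocks = [bits[i:i + block_size] for i in range(0, len(bits), block_size)]
        return f([g(b) for b in blocks])
    return lifted

xor_gadget = lambda block: sum(block) % 2   # g: {0,1}^m -> {0,1}
or_f = lambda ys: int(any(ys))              # f: {0,1}^n -> {0,1}

lifted = lift(or_f, xor_gadget, block_size=2)
print(lifted([0, 1, 1, 1]))   # OR(XOR(0,1), XOR(1,1)) = OR(1, 0) = 1
```

A lower bound proved for $f$ in the weaker measure is then transferred to `lifted` in the stronger measure, with losses depending on the gadget's size.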
- Conference Article
86
- 10.1145/129712.129730
- Jan 1, 1992
We describe two methods for estimating the size and depth of decision trees where a linear test is performed at each node. Both methods are applied to the question of deciding, by a linear decision tree, whether, given $n$ real numbers, some $k$ of them are equal. We show that the minimum depth of a linear decision tree for this problem is $\Theta(n \log(n/k))$. The upper bound is easy; the lower bound can be established for $k = O(n^{1/4-\epsilon})$ by a volume argument; for the whole range, however, our proof is more complicated and involves the use of some topology as well as the theory of Möbius functions.
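The underlying decision problem (do some $k$ of the $n$ given reals coincide?) is easy to state outside the linear-decision-tree model; a direct check, purely for illustration (the $\Theta(n \log(n/k))$ bound concerns tree depth, not this RAM computation):

```python
from collections import Counter

def some_k_equal(xs, k):
    """Return True iff at least k of the given numbers are equal."""
    return any(count >= k for count in Counter(xs).values())

print(some_k_equal([3.0, 1.5, 3.0, 3.0, 2.2], 3))  # True: 3.0 occurs 3 times
print(some_k_equal([1, 2, 3, 4], 2))               # False: all distinct
```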
- Research Article
49
- 10.1090/s0894-0347-1994-1243770-0
- Jan 1, 1994
- Journal of the American Mathematical Society
Topological methods are described for estimating the size and depth of decision trees where a linear test is performed at each node. The methods are applied, among others, to the questions of deciding by a linear decision tree whether, given $n$ real numbers, (1) some $k$ of them are equal, or (2) some $k$ of them are unequal. We show that the minimum depth of a linear decision tree for these problems is at least (1) $\max\{n-1,\ n\log_3(n/3k)\}$, and (2) $\max\{n-1,\ n\log_3(k-1)-k+1\}$. Our main lower bound for the size of linear decision trees for polyhedra $P$ in $\mathbf{R}^n$ is given by the sum of Betti numbers of the complement $\mathbf{R}^n \setminus P$. The applications of this general topological bound involve the computation of the Möbius function of intersection lattices of certain subspace arrangements. In particular, this leads to computing various expressions for the Möbius function of posets of partitions with restricted block sizes. Some of these formulas have topological meaning. For instance, we derive a formula for the Euler characteristic of the subset of $\mathbf{R}^n$ of points with no $k$ coordinates equal in terms of the roots of the truncated exponential $\sum_{i > k} x^i/i!$.
- Research Article
11
- 10.1007/s000370050010
- Dec 1, 1998
- Computational Complexity
We prove an exponential lower bound on the size of any fixed-degree algebraic decision tree for solving MAX, the problem of finding the maximum of $n$ real numbers. This complements the $n-1$ lower bound of [Rabin (1972)] on the depth of algebraic decision trees for this problem. The proof in fact gives an exponential lower bound on the size of the polyhedral decision problem MAX=, for testing whether the $j$-th number is the maximum among a list of $n$ real numbers. Previously, except for linear decision trees, no nontrivial lower bounds on the size of algebraic decision trees for any familiar problem were known. We also establish an interesting connection between our lower bound and the maximum number of minimal cutsets for any rank-$d$ hypergraph on $n$ vertices.
- Research Article
- 10.1145/333580.333587
- Sep 1, 1999
- ACM Computing Surveys
Applications include the size of distinct models of branching programs, the depth of decision trees, and data structure problems. To illustrate the progress covered by the above list we mention two specific contributions: the first superlinear lower bound on the size of planar Boolean circuits computing a specific Boolean function, and the first superpolylogarithmic lower bounds on the depth of monotone Boolean circuits. The big success of communication complexity applications should not be surprising, because we have information transfer in all computing models (for instance, between two parts of the input data, between parts (processors) of a parallel computing model, between two time moments, etc.). So, you can cut hardware, time, or both in your computing model, and then apply lower bounds on the communication complexity of your computing problem. In this way you obtain a lower bound on the information transfer that must be realized in the computing model considered in order to compute the given task. The appropriate choice of the cut is crucial for obtaining good lower bounds. One of the perspectives is to extend the applications to proving lower bounds for multilective and/or non-oblivious computing models; this is one of the hardest tasks of special importance in complexity theory. Recent results show that, using Ramsey theory and communication complexity over overlapping (not disjoint) partitions of inputs, one has good chances to achieve progress in this hard topic too. 3. NONDETERMINISTIC AND RANDOMIZED COMPUTATIONS. One of the central questions of current theoretical computer science is what computational power nondeterministic and randomized computations have, especially in comparison with deterministic ones. The fundamental questions about polynomial time computations (like P versus NP, P versus ZPP, P versus R) are long-standing open problems.
For communication complexity the research has been successful, and the relation between determinism, nondeterminism and randomness has been settled. This has essentially contributed to the understanding of the nature of randomness and nondeterminism. Some of the main results are the following: (1) There are exponential gaps between determinism and Monte Carlo randomness, and between nondeterminism and bounded-error probabilism. (2) Deterministic communication can be bounded by at most twice the product of the nondeterministic communication of the language and of its complement. This implies an at most quadratic gap between determinism and Las Vegas randomization; a language exhibiting this quadratic gap has been found. (3) There is a linear gap between determinism and Las Vegas randomness for one-way communication complexity. (4) O(log n) random bits are sufficient to reach the full power of randomized communication for Las Vegas and Monte Carlo (bounded-error) protocols. (5) In contrast to (4), there exist high thresholds on the amount of nondeterminism (for some computing problems the deterministic communication complexity is
- Research Article
19
- 10.1145/3230742
- Nov 26, 2018
- Journal of the ACM
We introduce a new and natural algebraic proof system, whose complexity measure is essentially the algebraic circuit size of Nullstellensatz certificates. This enables us to exhibit close connections between effective Nullstellensätze, proof complexity, and (algebraic) circuit complexity. In particular, we show that any super-polynomial lower bound on any Boolean tautology in our proof system implies that the permanent does not have polynomial-size algebraic circuits (VNP ≠ VP). We also show that super-polynomial lower bounds on the number of lines in Polynomial Calculus proofs imply the Permanent versus Determinant Conjecture. Note that there was no proof system prior to ours for which lower bounds on an arbitrary tautology implied any complexity class lower bound. Our proof system helps clarify the relationships between previous algebraic proof systems. In doing so, we highlight the importance of polynomial identity testing (PIT) in proof complexity. In particular, we use PIT to illuminate AC^0[p]-Frege lower bounds, which have been open for nearly 30 years, with no satisfactory explanation as to their apparent difficulty. Finally, we explain the obstacles that must be overcome in any attempt to extend techniques from algebraic circuit complexity to prove lower bounds in proof complexity. Using the algebraic structure of our proof system, we propose a novel route to such lower bounds. Although such lower bounds remain elusive, this proposal should be contrasted with the difficulty of extending AC^0[p] circuit lower bounds to AC^0[p]-Frege lower bounds.
- Research Article
14
- 10.1016/j.ins.2023.119252
- May 30, 2023
- Information Sciences
Z-number-valued rule-based decision trees
- Conference Article
10
- 10.1109/fskd.2014.6980959
- Aug 1, 2014
Post-pruning is a common method of decision tree pruning. However, existing post-pruning methods tend to use a single measure as the evaluation standard for pruning effects. A single, exclusive evaluation index for a decision tree is subjective and partial, and the trees obtained after pruning are often biased. This paper proposes a decision tree post-pruning algorithm based on comprehensively considering several evaluation standards, accounting simultaneously for classification ability, stability, and size, so as to reflect the overall quality of the decision tree. The user can choose the weight of each standard component according to the actual demand, to obtain a decision tree with a tendency that matches that demand. The experimental results show that the post-pruning algorithm considering classification accuracy, stability, and tree size yields a decision tree with more balanced classification performance and lower model complexity, while the classification accuracy stays unchanged or falls only within a tiny range.
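A weighted multi-criteria score of the kind described might look like the sketch below; the weights, the size normalization, and the [0, 1] scaling are hypothetical illustrations, not the paper's exact formula.

```python
def pruning_score(accuracy, stability, n_nodes,
                  w_acc=0.5, w_stab=0.3, w_size=0.2):
    """Combine several evaluation standards into one pruning score.

    accuracy, stability: assumed to lie in [0, 1];
    n_nodes: tree size, where smaller is better.
    The weights are user-chosen to reflect the actual demand.
    """
    size_term = 1.0 / (1.0 + n_nodes)   # hypothetical size normalization
    return w_acc * accuracy + w_stab * stability + w_size * size_term

# Prefer the pruned tree when its composite score is at least as high:
# a small accuracy drop can be outweighed by gains in stability and size.
full = pruning_score(accuracy=0.95, stability=0.80, n_nodes=120)
pruned = pruning_score(accuracy=0.94, stability=0.90, n_nodes=35)
print(pruned > full)
```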
- Research Article
3
- 10.7916/d8w66sm1
- Jan 1, 2009
A longstanding lacuna in the field of computational learning theory is the learnability of succinctly representable monotone Boolean functions, i.e., functions that preserve the given order of the input. This thesis makes significant progress towards understanding both the possibilities and the limitations of learning various classes of monotone functions by carefully considering the complexity measures used to evaluate them. We show that Boolean functions computed by polynomial-size monotone circuits are hard to learn assuming the existence of one-way functions. Having shown the hardness of learning general polynomial-size monotone circuits, we show that the class of Boolean functions computed by polynomial-size depth-3 monotone circuits are hard to learn using statistical queries. As a counterpoint, we give a statistical query learning algorithm that can learn random polynomial-size depth-2 monotone circuits (i.e., monotone DNF formulas). As a preliminary step towards a fully polynomial-time, proper learning algorithm for learning polynomial-size monotone decision trees, we also show the relationship between the average depth of a monotone decision tree, its average sensitivity, and its variance. Finally, we return to monotone DNF formulas, and we show that they are teachable (a different model of learning) in the average case. We also show that non-monotone DNF formulas, juntas, and sparse GF2 formulas are teachable in the average case.
- Book Chapter
6
- 10.1007/978-3-642-22993-0_51
- Jan 1, 2011
A linear decision tree is a binary decision tree in which the classification rule at each internal node is defined by a linear threshold function. In this paper, we consider a linear decision tree $T$ where the weights $w_1, w_2, \ldots, w_n$ of each linear threshold function satisfy $\sum_i |w_i| \le w$ for an integer $w$, and prove that if $T$ computes an $n$-variable Boolean function of large unbounded-error communication complexity (such as the Inner-Product function modulo two), then $T$ must have $2^{\Omega(\sqrt{n})}/w$ leaves. To obtain the lower bound, we utilize a close relationship between the size of linear decision trees and the energy complexity of threshold circuits; the energy of a threshold circuit $C$ is defined to be the maximum number of gates outputting "1," where the maximum is taken over all inputs to $C$. In addition, we consider threshold circuits of depth $\omega(1)$ and bounded energy, and provide two exponential lower bounds on the size (i.e., the number of gates) of such circuits.
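The Inner-Product function modulo two mentioned above, and a single linear-threshold test of the kind performed at each tree node, can be written down directly; the particular weights and threshold in the example are illustrative, not tied to the paper's construction.

```python
def ip_mod2(x, y):
    """Inner-Product modulo two on two equal-length bit vectors."""
    return sum(a * b for a, b in zip(x, y)) % 2

def threshold_node(weights, theta, bits):
    """One linear-threshold test: branch on sum_i w_i * x_i >= theta."""
    return sum(w * b for w, b in zip(weights, bits)) >= theta

print(ip_mod2([1, 0, 1], [1, 1, 1]))             # (1 + 0 + 1) mod 2 = 0
print(threshold_node([2, -1, 1], 2, [1, 0, 1]))  # 2 + 0 + 1 = 3 >= 2 -> True
```

The paper's weight restriction corresponds to bounding `sum(abs(w) for w in weights)` by the integer $w$ at every node of the tree.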
- Research Article
2
- 10.1609/aaai.v37i7.25963
- Jun 26, 2023
- Proceedings of the AAAI Conference on Artificial Intelligence
We study the problem of explainability-first clustering where explainability becomes a first-class citizen for clustering. Previous clustering approaches use decision trees for explanation, but only after the clustering is completed. In contrast, our approach is to perform clustering and decision tree training holistically where the decision tree's performance and size also influence the clustering results. We assume the attributes for clustering and explaining are distinct, although this is not necessary. We observe that our problem is a monotonic optimization where the objective function is a difference of monotonic functions. We then propose an efficient branch-and-bound algorithm for finding the best parameters that lead to a balance of clustering accuracy and decision tree explainability. Our experiments show that our method can improve the explainability of any clustering that fits in our framework.
- Single Book
2151
- 10.1017/cbo9780511804090
- Apr 20, 2009
This beginning graduate textbook describes both recent achievements and classical results of computational complexity theory. Requiring essentially no background apart from mathematical maturity, the book can be used as a reference for self-study for anyone interested in complexity, including physicists, mathematicians, and other scientists, as well as a textbook for a variety of courses and seminars. More than 300 exercises are included with a selected hint set. The book starts with a broad introduction to the field and progresses to advanced results. Contents include: definition of Turing machines and basic time and space complexity classes, probabilistic algorithms, interactive proofs, cryptography, quantum computation, lower bounds for concrete computational models (decision trees, communication complexity, constant depth, algebraic and monotone circuits, proof complexity), average-case complexity and hardness amplification, derandomization and pseudorandom constructions, and the PCP theorem.
- Research Article
8
- 10.3233/fi-2000-41303
- Jan 1, 2000
- Fundamenta Informaticae
In the paper, infinite information systems are considered which are used in pattern recognition, discrete optimization, computational geometry. Depth and size of deterministic and nondeterministic decision trees over such information systems are studied. Two classes of infinite information systems are investigated. Systems from these classes are best from the point of view of time complexity and space complexity of deterministic as well as nondeterministic decision trees. In proofs methods of test theory [1] and rough set theory [6, 9] are used.
- Conference Article
2
- 10.15439/2014f256
- Sep 29, 2014
We use decision trees as a model to discover knowledge from multi-label decision tables, where each row has a set of decisions attached to it and the goal is to find one arbitrary decision from the set attached to a row. The size of the decision tree can be small as well as very large. We study different greedy as well as dynamic programming algorithms to minimize the size of the decision trees. Comparing against the optimal results of the dynamic programming algorithm, we found that some greedy algorithms produce results close to optimal for the minimization of the number of nodes (at most 18.92% difference), the number of nonterminal nodes (at most 20.76% difference), and the number of terminal nodes (at most 18.71% difference).
- Research Article
1
- 10.3217/jucs-020-09-1174
- Jan 9, 2014
- Journal of Universal Computer Science
Classification is a constitutive part in many different fields of Computer Science. There exist several approaches that capture and manipulate classification information in order to construct a specific classification model. These approaches are often tightly coupled to certain learning strategies, special data structures for capturing the models, and to how common problems, e.g. fragmentation, replication and model overfitting, are addressed. In order to unify these different classification approaches, we define a Decision Algebra which defines models for classification as higher order decision functions abstracting from their implementations using decision trees (or similar), decision rules, decision tables, etc. Decision Algebra defines operations for learning, applying, storing, merging, approximating, and manipulating models for classification, along with some general algebraic laws regardless of the implementation used. The Decision Algebra abstraction has several advantages. First, several useful Decision Algebra operations (e.g., learning and deciding) can be derived based on the implementation of a few core operations (including merging and approximating). Second, applications using classification can be defined regardless of the different approaches. Third, certain properties of Decision Algebra operations can be proved regardless of the actual implementation. For instance, we show that the merger of a series of probably accurate decision functions is even more accurate, which can be exploited for efficient and general online learning. As a proof of the Decision Algebra concept, we compare decision trees with decision graphs, an efficient implementation of the Decision Algebra core operations, which capture classification models in a non-redundant way. Compared to classical decision tree implementations, decision graphs are 20% faster in learning and classification without accuracy loss and reduce memory consumption by 44%. 
This is the result of experiments on a number of standard benchmark data sets comparing accuracy, access time, and size of decision graphs and trees as constructed by the standard C4.5 algorithm. Finally, in order to test our hypothesis about increased accuracy when merging decision functions, we merged a series of decision graphs constructed over the data sets. The result shows that at each step the accuracy of the merged decision graph increases, with a final accuracy gain of up to 16%.
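One natural reading of merging a series of decision functions is majority voting over their outputs; the sketch below assumes that reading, and the three threshold rules are invented for illustration.

```python
from collections import Counter

def merge(decision_fns):
    """Merge classifiers by majority vote (one reading of the merge op)."""
    def merged(x):
        votes = Counter(f(x) for f in decision_fns)
        return votes.most_common(1)[0][0]
    return merged

# Three hypothetical single-feature rules that mostly agree.
f1 = lambda x: x > 3
f2 = lambda x: x > 5
f3 = lambda x: x > 4
g = merge([f1, f2, f3])
print(g(6))  # all three vote True
print(g(4))  # only f1 votes True, so the merged decision is False
```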
- Book Chapter
- 10.1007/11527503_17
- Jan 1, 2005
For interactive data mining of very large databases, a method is proposed that works with relatively small training data extracted from the target databases by sampling, because generating decision trees for the data mining of very large databases containing many continuous data values takes a very long time, and the size of a decision tree tends to depend on the size of the training data. The method proposes to use samples of confidence in proper size as the training data, both to generate comprehensible trees and to save time. For medium or small databases, direct use of the original data with some harsh pruning may be used instead, because the pruning generates trees of similar size with smaller error rates.