An identifying code $C$ of a graph $G$ is a dominating set of $G$ such that any two distinct vertices of $G$ have distinct closed neighbourhoods within $C$. These codes have been widely studied for over two decades. We give an improvement over all the best known upper bounds, some of which have stood for over 20 years, for identifying codes in trees, proving the upper bound of $(n+\ell)/2$, where $n$ is the order and $\ell$ is the number of leaves (pendant vertices) of the graph. In addition to being an improvement in size, the new upper bound is also an improvement in generality, as it actually holds for bipartite graphs having no twins (pairs of vertices with the same closed or open neighbourhood) of degree 2 or greater. We also show that the bound is tight for an infinite class of graphs and that there are several structurally different families of trees attaining the bound. We then use our bound to derive a tight upper bound of $2n/3$ for twin-free bipartite graphs of order $n$, and characterize the extremal examples, as $2$-corona graphs of bipartite graphs. This is best possible, as there exist twin-free graphs, and trees with twins, that need $n-1$ vertices in any of their identifying codes. We also generalize the existing upper bound of $5n/7$ for graphs of order $n$ and girth at least 5 when there are no leaves, to the upper bound $\frac{5n+2\ell}{7}$ when leaves are allowed. This is tight for the $7$-cycle $C_7$ and for all stars.
Read full abstract