Abstract

<p class="p1">In this work, we focus on the problem of “annotation tagging” over Information Spaces of objects stored in a full-text index. In such a scenario, tags are assigned to objects by “data curator” users for the purpose of classification, while generic end-users perceive tags as searchable and browsable object properties. To carry out their activities, data curators need “annotation tagging tools” which allow them to “bulk” tag or untag large sets of objects in temporary work sessions, where they can “virtually” and in “real-time” experiment with the effects of their actions before making the changes visible to end-users. Implementing these tools over full-text indexes is a challenge, since bulk object updates in this context are far from real-time and in critical cases may slow down index performance. We devised TagTick, a tool which offers data curators a fully functional annotation tagging environment over the full-text index Apache Solr, regarded as a “de-facto standard” in this area. TagTick consists of a TagTick Virtualizer module, which extends the APIs of Solr to support real-time, virtual, bulk-tagging operations, and a TagTick User Interface module, which offers end-user functionalities for annotation tagging. The tool scales optimally with the number and size of bulk tag operations, without compromising index performance.</p>

Highlights

  • Tags are generally conceived as nonhierarchical terms assigned to an information object in order to enrich its description beyond the one provided by object properties

  • The results presented in Figure 5 show that the average execution time of search and browse queries always remains under 2 seconds, which we can consider within the “real-time” threshold from the users' point of view

  • We presented TagTick, a tool devised to enable annotation tagging functionalities over Solr instances


Summary

INTRODUCTION

Tags are generally conceived as nonhierarchical terms (or keywords) assigned to an information object (e.g., a digital image, a document, a metadata record) in order to enrich its description beyond the one provided by object properties.

The commitSession(s) command is responsible for applying to the initial Information Space I the changes made in s, i.e., adding tags to and removing tags from objects in I according to the actions in s. To this aim, the module relies on the map ρ, which associates each tag (i, t) with the set of objects virtually tagged by (i, t) in s, and on the low-level function andNotDocsets.

The TagTick User Interface allows users to search for objects of all classes by means of free keywords and to refine such searches by class and by the tags relative to that class. This combination of predicates, which matches the query structure q = q And qtags in session expected by the TagTick Virtualizer, is executed by the module and the results are presented in the interface.

This is a one-time operation, required only when logging in to the tool.
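
The role of ρ and andNotDocsets in commitSession(s) can be illustrated with a minimal in-memory sketch. It is not the TagTick implementation: the Information Space is modelled as a plain dictionary rather than a Solr index, a tag (i, t) is treated as an opaque pair, and andNotDocsets is assumed to compute the difference between two sets of object identifiers. The names `and_not_docsets`, `commit_session`, `index` and `rho` are hypothetical.

```python
from typing import Dict, Set, Tuple

Tag = Tuple[str, str]   # a tag (i, t), treated here as an opaque pair
DocSet = Set[int]       # a set of object identifiers


def and_not_docsets(a: DocSet, b: DocSet) -> DocSet:
    """Assumed meaning of andNotDocsets: objects in a that are not in b."""
    return a - b


def commit_session(index: Dict[int, Set[Tag]], rho: Dict[Tag, DocSet]) -> None:
    """Apply the virtual tagging state of a session to the index, in place.

    index maps each object id to the tags committed on it (a stand-in for I).
    rho maps each tag to the objects virtually tagged with it in the session.
    """
    for tag, virtual in rho.items():
        # Objects that carry the tag before the commit.
        committed = {doc for doc, tags in index.items() if tag in tags}
        # Virtually tagged but not yet committed: add the tag.
        for doc in and_not_docsets(virtual, committed):
            index.setdefault(doc, set()).add(tag)
        # Committed but no longer virtually tagged: remove the tag.
        for doc in and_not_docsets(committed, virtual):
            index[doc].discard(tag)


# Usage: object 1 loses the tag, object 2 gains it, object 3 is unchanged.
tag = ("i", "t")
index = {1: {tag}, 2: set(), 3: {tag}}
commit_session(index, {tag: {2, 3}})
```

Under these assumptions only the objects whose state differs between the session and the index are touched, which is one plausible reason for relying on a set-difference primitive during commit.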

CONCLUSIONS
