Abstract

Those of you who attended last year’s conference may remember that I was here too and chaired one of the sessions. At that time, I mentioned that we had just started a new project called “VolltextsucheOnline”, which means “full-text search online”. Today I am here to present the project again, and I am proud to tell you that it is now up and running: the website was launched at the last Frankfurt Book Fair. We have also changed the project’s name, because full-text search is just one small function within the whole set of features we want to offer. The new name is libreka!. We also wanted to make the point that we are not a “YACS”, that is, “Yet Another Californian Start-up”. On the contrary, MVB is based at the heart of the German publishing industry and is wholly owned by the Börsenverein, the German Publishers and Booksellers Association.

What do we want to achieve with libreka!? To put it very simply: we want to put every German-language book in print online. We want to be the leading internet platform for the German book trade or, figuratively speaking, the Swiss army knife of e-content. First of all, we have a marketing objective: we want to increase the reach of publishers and booksellers. We also want to open up new business models and sales channels for them, and to protect intellectual property. We want to give publishers control over how much and what they publish on the internet.

What do I mean when I say that we want to increase the reach of the publisher or bookseller? First, by putting every book in print online with libreka!, we want to give end-users access via the internet to culture, knowledge and education. At the same time, we offer every bookseller a service similar to the “Search Inside” feature offered by a large online retailer. Furthermore, we want to integrate the content on libreka! into the main search engines: Google, Yahoo and Microsoft. If you look for a book that is on libreka! today, you will probably find it on Google as well. But what we want is to drive initiatives like “Google Book Search” or “Microsoft Live Search” by adding our content and thereby making it more easily available. At the same time, libreka! gives booksellers and publishers the opportunity to offer their customers a search-inside-the-book function and a chance to read samples of the books on libreka!.

The integration with the search engines basically works in two ways. The first is very easy: the search engines index our HTML pages, which is what they are already doing today. The second is more complicated. The data we get from publishers is in PDF format, which is not an ideal format to work with. We would like to move to XML, but that data is rather difficult to obtain, so we are stuck with PDF for the moment. What we do is extract the text from the PDF and build our own text index. This index is purely text: no images and no meta-information about the book. Then we give access to that text
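
As a rough illustration of the kind of text index described above, the following Python sketch builds a simple inverted index over page texts and answers queries against it. It is not libreka!'s actual implementation. The function names, the sample data and the (book_id, page_number) structure are all assumptions made for this example, and it presumes that the text has already been extracted from the publishers' PDF files.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(pages):
    """Map each token to the set of (book_id, page_number) pairs containing it.

    `pages` is an iterable of (book_id, page_number, text) tuples. The text is
    assumed to be plain text already extracted from a PDF: no images and no
    metadata about the book.
    """
    index = defaultdict(set)
    for book_id, page_number, text in pages:
        for token in tokenize(text):
            index[token].add((book_id, page_number))
    return index


def search(index, query):
    """Return the sorted list of pages that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)


# Hypothetical sample data standing in for text extracted from PDFs.
pages = [
    ("book-1", 1, "Die Geschichte des Buchhandels in Frankfurt"),
    ("book-1", 2, "Der Buchhandel und das Internet"),
    ("book-2", 1, "Eine kurze Geschichte des Internet"),
]
index = build_index(pages)
print(search(index, "Geschichte Internet"))
```

Keeping page numbers in the index is one possible design choice: it would let a search-inside-the-book function point the reader to the matching page rather than only to the book.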
