Querying of Geo-Textual Web Content: Concepts and Techniques

Geo-textual web content is content accessible on the web that has a text component as well as a geo-location, in addition to possibly other attributes such as a timestamp or a rating. Increasing volumes of such objects are available on the web, including web pages, business directory entries, and microblog posts.

Geo-textual web content is relevant and important to many human activities. Studies suggest that each week, several billion keyword-based queries are issued that have some form of local intent and target geo-textual web content. Such queries aim to find relevant content in an implied geographical region.

This state of affairs gives prominence to the capability of efficiently computing queries that retrieve useful content from large collections of geo-textual web content. A standard keyword-based query takes a user location and user-supplied keywords as arguments, and it returns content that is geographically and textually relevant to these arguments. Perhaps because of the rich semantics of geographical space and its importance to our daily lives, many different kinds of useful geo-textual web query functionality may be envisioned. For example, some queries aim to find a few nearby points of interest that each satisfy a user’s needs as indicated by query keywords, other queries aim to find a set of points of interest that collectively satisfy a user’s needs, and yet other queries aim to find densely populated regions that enable a user to conveniently explore different relevant points of interest.
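As a rough illustration of the standard query described above, the following sketch ranks objects by a weighted combination of spatial proximity and keyword overlap and returns the top k. Everything in it is an assumption made for illustration and is not taken from the talk: the names `GeoTextObject`, `score`, and `top_k`, the planar Euclidean distance, the linear scoring function, and the parameters `alpha` and `max_dist`. It scans all objects, whereas a practical system would typically rely on an index.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoTextObject:
    """Hypothetical geo-textual object: a name, a planar location, and a set of terms."""
    name: str
    x: float
    y: float
    terms: frozenset


def score(obj, qx, qy, keywords, alpha=0.5, max_dist=10.0):
    """Illustrative relevance score in [0, 1]; higher is better.

    alpha weighs spatial proximity against textual relevance (assumed form).
    """
    dist = math.hypot(obj.x - qx, obj.y - qy)
    # Proximity falls linearly from 1 at the query location to 0 at max_dist.
    spatial = max(0.0, 1.0 - dist / max_dist)
    # Textual relevance: fraction of query keywords present in the object.
    textual = len(obj.terms & keywords) / len(keywords) if keywords else 0.0
    return alpha * spatial + (1.0 - alpha) * textual


def top_k(objects, qx, qy, keywords, k=2, alpha=0.5, max_dist=10.0):
    """Return the k highest-scoring objects by exhaustive scan (no index)."""
    kw = frozenset(keywords)
    ranked = sorted(
        objects,
        key=lambda o: score(o, qx, qy, kw, alpha, max_dist),
        reverse=True,
    )
    return ranked[:k]
```

For example, with the user at (0, 0) and the keywords "coffee" and "wifi", a nearby object matching both keywords outranks a distant object matching both, which in turn outranks a very close object matching neither, under the default `alpha`.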

Based on recent and ongoing work by the speaker and his colleagues, the talk presents key functionality, concepts, and techniques relating to the querying of geo-textual web content; it covers functionality that addresses different kinds of user intent; and it offers directions for the future development of keyword-based geo-textual web querying.

Bio: Christian S. Jensen is Obel Professor of Computer Science at Aalborg University, Denmark. He was previously with Aarhus University for three years and spent a one-year sabbatical at Google Inc., Mountain View. His research concerns data management and data-intensive systems, with a focus on temporal and spatio-temporal data management. Christian is an ACM and an IEEE Fellow, and he is a member of Academia Europaea, the Royal Danish Academy of Sciences and Letters, and the Danish Academy of Technical Sciences. He has received several national and international awards for his research. He is Editor-in-Chief of ACM Transactions on Database Systems.