Avocets
NC State's project page -- the public page, at least.

CQL::Parser is a Perl module for parsing Common Query Language statements.

CQL is a formal language for representing queries to information retrieval systems such as web indexes, bibliographic catalogs and museum collection information. The design objective is that queries be human readable and writable, and that the language be intuitive while maintaining the expressiveness of more complex languages.

CQL::Parser allows you to validate statements and parse them into a parse tree, which you can then programmatically walk and use. For your convenience, there are methods for converting the CQL parse tree into Swish and Lucene queries, as well as into XCQL (an XML representation of CQL).

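  use CQL::Parser;

  # Parse a CQL statement; $root is the top node of the parse tree.
  my $parser = CQL::Parser->new();
  my $root = $parser->parse('dc.creator="clinton"');

  # Convert the parse tree into other query syntaxes.
  my $swish = $root->toSwish();    # Swish query
  my $lucene = $root->toLucene();  # Lucene query
  my $xcql = $root->toXCQL();      # XCQL (XML representation of CQL)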
tagged SRU discovery dlf_spring_2006 searching by winkler4 ...on 11-APR-06
Nice layout to look at for LoST
tagged cni_spring_2006 development lost searching by winkler4 ...on 04-APR-06
We present a new method for content management and knowledge discovery using a topology-preserving neural network. The method, termed topological organization of content (TOC), can generate a taxonomy of topics from a set of unannotated, unstructured documents. The TOC consists of a hierarchy of self-organizing growing chains (GCs), each of which can develop independently in terms of size and topics. The dynamic development process is validated continuously using a proposed entropy-based Bayesian information criterion (BIC). Each chain meeting the criterion spawns child chains, with reduced vocabularies and increased specializations. This results in a topological tree hierarchy, which can be browsed like a table-of-contents directory or web portal. A brief review is given of existing methods for document clustering and organization, and of clustering validation measures. The proposed approach has been tested and compared with several existing methods on real-world web page datasets. The results demonstrate the advantages of the proposed method for content organization in terms of computational cost and representation. The TOC can be easily adapted for large-scale applications. The topology provides a unique, additional feature for retrieving related topics and confining the search space.