From the website:
The Apache Directory Server is an embeddable LDAP server written in Java. It has been designed to introduce triggers, stored procedures, queues and views to the world of LDAP, which has lacked these rich constructs.
From the website:
Roller is the open source blog server that drives Sun Microsystems' blogs.sun.com employee blogging site, IBM DeveloperWorks blogs, thousands of internal blogs at IBM Blog Central, the Javalobby's 10,000 user strong JRoller Java community site, and hundreds of other blogs worldwide. Read more about Roller on the About page.
SearchTools User Experience Recommendations
While search engine pages share many interface elements with other parts of web site design, there are certain principles that you should keep in mind:
- Put a simple, reasonably long search field on every page of the site.
- Use simple words to explain the process: remove all jargon and technical terms, and make sure that any icons have labels.
- Avoid inventing a new interface, which will confuse users: take the best of the formats of the large public search engines.
- Make the search forms and results pages fit into the overall design of the web site: they should use the same colors, fonts and so on.
- Include site names and navigation links in results pages, so users can see the context and structure of the site.
- Set up a special page to be displayed when the search does not find any matches in the index (see No-Matches Page Guidelines).
- Avoid surprises: clarify all automated search features, such as stemming, phonetic matching, thesaurus lookups and stopwords (see Glossary).
Northern Light Enterprise Search Engine Features
Performance. With a database indexing 5 million documents totaling 25 gigabytes of content, and using a single query server with a single software installation, Northern Light is rated at 216 queries per second with a query response time averaging 80 milliseconds.
Scalability. Northern Light can search databases of more than 50 million documents with a single software installation on a single server. (Unlike some enterprise search engine vendors that require you to add another server appliance every time you want to add as few as 150,000 documents to your database.)
Relevance ranking effectiveness. Northern Light has a unique seventeen-factor approach to relevance ranking that considers statistical text measures, hyperlink analysis, subject classification, and date, and balances all these dynamically, weighting the factors based on what will be most useful for a given query. What, you ask, are statistical text measures? Well, a few examples would be the number of times the query terms are in the document relative to the length of the document, the proximity of the query terms in the document, the word order of the query terms in the document, the presence of the query terms in the document metadata, and the inverse term frequency of the query terms in the database as a whole.
Automatic classification. Northern Light has patented, proprietary technology that classifies every document in the database by subject, type, language, and source. We provide a complete 17,000-node subject taxonomy developed by our expert gang of librarians that is extensible and customizable. Our classification powers advanced search forms, vertical search applications, and our patented Custom Search Folders™ for results navigation.
Flexible query parsing. Northern Light allows keywords, Boolean expressions (all operators, compound, and nested), natural language, phrase searching, wildcards, and any combination of these.
Search on any metadata. All metadata is represented in the index, which means you can use search forms or syntax to qualify the results. Search on title, sources, document types, etc. You can add any metadata that makes sense to your organization and search on that tag.
Security. Northern Light integrates with your network authentication, and all security protocols are observed. That means users can only access information for which they're authorized, and you can easily add and remove users.
Open API. Our search engine has well-documented APIs using J2EE standards, XML search results, and JSP sample code that support the integration of Northern Light into corporate applications.
Content integration. Our published load format specification allows any file type, from any source, located anywhere to be indexed and searched. The data conversion system includes filters for Microsoft Office, PDF, HTML (including JSP and ASP), and text formats including XML.
Discovery-based crawler. Northern Light's crawler follows links on your network to discover content for indexing. The crawler connects via HTTP, HTTPS, FTP, NFS and SMB (Windows) protocols and supports multiple authentication methods.
Administration tools. We provide a browser-based administration system that includes a basic search UI, a scheduler to manage crawling, data conversion, database loading, and a system configuration manager.
Platforms. Northern Light is available on Linux.
This is a very nice example of the power of faceted searching. From the website:
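Two of the statistical text measures named under relevance ranking above (query-term count relative to document length, and inverse term frequency across the whole database) can be sketched in a few lines of Python. This is only an illustrative toy, not Northern Light's seventeen-factor method; the function names `tokenize`, `score` and `rank`, the whitespace tokenizer, and the smoothed logarithm are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for a real tokenizer.
    return text.lower().split()


def score(query, doc, corpus):
    """Sum over query terms of (length-normalized term frequency) * (inverse document frequency)."""
    doc_tokens = tokenize(doc)
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    total = 0.0
    for term in set(tokenize(query)):
        # Occurrences of the term relative to the length of the document.
        tf = counts[term] / len(doc_tokens)
        # Number of documents in the whole collection that contain the term.
        df = sum(1 for d in corpus if term in tokenize(d))
        # Smoothed inverse frequency: rarer terms get a larger weight.
        idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0
        total += tf * idf
    return total


def rank(query, corpus):
    # Highest-scoring documents first.
    return sorted(corpus, key=lambda d: score(query, d, corpus), reverse=True)


if __name__ == "__main__":
    docs = [
        "ldap server in java",
        "blog server",
        "search engine ranking with ldap ldap",
    ]
    for d in rank("ldap", docs):
        print(round(score("ldap", d, docs), 3), d)
```

A real engine would combine such scores with the other factors listed (proximity, word order, metadata matches, link analysis, date), which this sketch leaves out.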
MSRA SRC Toolbar is a tool for searching the web with the Search Result Clustering (SRC) technique, which was developed at the Web Search and Mining Group in MSR, Asia. It clusters a given search engine's results into different groups on the fly, and provides meaningful and readable names for these groups. SRC changes the traditional representation of search results into a non-linear one, so as to facilitate the user's browsing.
Traditional clustering techniques don't work for this problem because the documents are short, the cluster names should be readable and the algorithm should be efficient enough for on-the-fly calculation. Our method approaches the whole problem in another way and overcomes the difficulties of traditional clustering methods. Basically, we try to first identify salient topics by identifying distinct and independent keywords, and then classify the search results into these topics.
The following is the corresponding paper to this technology:
Hua-Jun Zeng, Qi-Cai He, Zheng Chen, and Wei-Ying Ma. Learning To Cluster Web Search Results. In Proceedings of the 27th annual international conference on research and development in information retrieval (SIGIR'04), pp. 210-217, Sheffield, United Kingdom, July 2004.
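The two-step idea quoted above (first pick salient keywords, then classify the results into those topics) can be illustrated with a deliberately crude Python sketch. It is not the method from the paper or the toolbar: here "salient" simply means "appears in at least two snippets and is not a query term", and the names `STOPWORDS`, `keywords` and `cluster`, as well as the "other" bucket, are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Assumption: a tiny hand-picked stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "for", "in", "to", "on", "with"}


def keywords(snippet):
    # Distinct lowercase words in a snippet, minus stopwords.
    return {w for w in snippet.lower().split() if w not in STOPWORDS}


def cluster(snippets, query, max_topics=3):
    """Group result snippets under the most frequent shared keywords."""
    query_terms = keywords(query)

    # Step 1: count in how many snippets each non-query keyword appears.
    doc_freq = Counter()
    for s in snippets:
        doc_freq.update(keywords(s) - query_terms)

    # Keep the most widely shared keywords; ties are broken alphabetically.
    ordered = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    topics = [w for w, n in ordered[:max_topics] if n >= 2]

    # Step 2: assign each snippet to the first topic it mentions.
    groups = defaultdict(list)
    for s in snippets:
        words = keywords(s)
        matched = [t for t in topics if t in words]
        groups[matched[0] if matched else "other"].append(s)
    return dict(groups)


if __name__ == "__main__":
    results = [
        "jaguar car dealer",
        "jaguar car review",
        "jaguar cat habitat",
        "jaguar cat zoo",
        "jaguar os x",
    ]
    for label, items in cluster(results, "jaguar").items():
        print(label, items)
```

The topic keyword doubles as the readable cluster name, which mirrors the requirement in the quoted text that group names be meaningful to the user.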
It's all about tools, baby...
The Yahoo! User Interface (YUI) Library is a set of utilities and controls, written in JavaScript, for building richly interactive web applications using techniques such as DOM scripting, DHTML and AJAX. The YUI Library also includes several core CSS resources. All components in the YUI Library have been released as open source under a BSD license and are free for all uses.
Education Commons is a virtual community of academic systems users, designers and systems implementers sharing knowledge, experiences and best practices.
The goal of the community is to create an open and transparent system of communication between diverse groups committed to advancing the state of education worldwide. It's meant to be a virtual commons, where sharing and participation are key. We encourage you to contribute your thoughts, ideas, programs and projects.


