Copyright © 2006 Together Teamlösungen EDV-Dienstleistungen GmbH
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the Together Teamlösungen EDV-Dienstleistungen GmbH.
Together Teamlösungen EDV-Dienstleistungen GmbH DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Table of Contents
Together Search Server
Preview application
Application improvements:
Optimized parsing procedure for certain file formats.
Search application
Application improvements:
Minor bug fixes.
Table of Contents
Together Search Server
Preview application
New feature:
Added Office downgrade actions in context menu for Microsoft Office files.
Search application
New feature:
Added Office downgrade actions in context menu for Microsoft Office files.
Table of Contents
Together Search Server
Preview application
Application improvements:
Updated WebDav implementation and improved document management actions.
Search application
New feature:
Added parameter 'Snapper/ShowPreview' - controls whether the previewer is shown with the search results (default: false).
Application improvements:
Updated WebDav implementation and improved document management actions.
Table of Contents
Together Search Server
Preview application
New feature:
Added PDF actions in context menu.
Search application
New feature:
Added PDF actions in context menu.
Table of Contents
Together Search Server
Preview application
Bug fix:
Fixed bug with parsing certain PDF files.
Table of Contents
Together Search Server
Preview application
New feature:
Added functionality for embedding file descriptor from WebDav into result XML of previewer.
Table of Contents
Together Search Server
Admin application
Bug fix:
Fixed bug with parsing certain files during indexing.
Table of Contents
Together Search Server
Preview application
Application improvements:
Improved temporary folder creation functionality.
New feature:
Added parameter 'Previewer/MaxFileSize' - maximum size, in KB, of a file that can be previewed (default: 0 - no limit).
Added parameter 'Previewer/TempDir' - path to root directory for creating temporary folders.
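The two parameters above can be pictured with a minimal sketch. It assumes that 1 KB means 1024 bytes and that temporary folders are created directly below the configured root. The class and method names are hypothetical and are not taken from the Preview application.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class PreviewLimits {

    /**
     * Returns true if a file of the given size may be previewed.
     * maxFileSizeKb mirrors 'Previewer/MaxFileSize'; 0 means "no limit".
     * Assumption: 1 KB = 1024 bytes, and negative values are treated like 0.
     */
    static boolean withinLimit(long sizeBytes, long maxFileSizeKb) {
        if (maxFileSizeKb <= 0) {
            return true;
        }
        return sizeBytes <= maxFileSizeKb * 1024L;
    }

    /**
     * Creates a uniquely named temporary folder below the root that
     * 'Previewer/TempDir' points to (the "preview-" prefix is hypothetical).
     */
    static Path createTempFolder(Path tempRoot) throws IOException {
        Files.createDirectories(tempRoot);
        return Files.createTempDirectory(tempRoot, "preview-");
    }
}
```

For example, `PreviewLimits.withinLimit(file.length(), 2048)` would accept files up to 2 MB under these assumptions.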
Table of Contents
Together Search Server
Preview application
Bug fix:
Fixed bug with date format conversion.
Table of Contents
Together Search Server
Preview application
Bug fix:
Fixed bug with sheet order in certain Excel 2003 files.
Table of Contents
Together Search Server
Preview application
Bug fix:
Fixed bug with preview of some Word 2003 and older format files.
Table of Contents
Together Search Server
Admin application
Application improvements:
New Office 2007 parsers introduced (faster indexing of Word, Excel and PowerPoint 2007 files).
Preview application
Bug fix:
Fixed bug with downloading files from https locations.
New feature:
Double click on previewer title bar opens first action from context menu.
Application improvements:
New Office 2007 parsers introduced (faster preview of Word, Excel and PowerPoint 2007 files).
Search application
New feature:
Added embedded preview frame in search result page (click on document in search result list opens its preview).
Added multiple files compressed download from search result page.
Double click on document in search result list opens first action from context menu.
Table of Contents
Together Search Server
Preview application
Bug fix:
Fixed bug with path for document management actions.
Table of Contents
Together Search Server
Admin application
Added support for files with following extensions:
Microsoft Word 2007: docm, dotm, dotx.
Microsoft Excel 2007: xlsm, xltm, xltx.
Microsoft PowerPoint 2007: potm, potx, ppsm, ppsx, pptm.
Preview application
Added support for files with following extensions:
Microsoft Word 2007: docm, dotm, dotx.
Microsoft Excel 2007: xlsm, xltm, xltx.
Microsoft PowerPoint 2007: potm, potx, ppsm, ppsx, pptm.
Table of Contents
Together Search Server
Search application
Application improvements:
Improved ContextMenu functionality on Search Result page.
ActiveX components updated to version 1.0.21.
Preview application
Application improvements:
Improved ContextMenu functionality on Preview page.
ActiveX components updated to version 1.0.21.
Table of Contents
Together Search Server
Search application
Removed fields for Meta Data search.
Application improvements:
Improved support for multi language templates.
Table of Contents
Together Search Server
Search application
New functionalities
Search functionality is now implemented through WebDav.
Application improvements
Document management actions are now available through context menu on single or group of documents in search result page.
Added new keyboard actions on search result page (Page Up, Page Down, Home, End, Enter, Ctrl+C...).
ActiveX components updated to version 1.0.19.
Admin application
New functionalities
New parameters in siteConf.xml introduced:
1. PREVIEWROOT (if empty, default preview option will be used - request will be sent to Preview application).
2. DOWNLOADROOT (if empty, download option will be disabled).
Application improvements
Bug fix: Solved problem with parsing some Word files that failed or produced duplicated text as output.
Preview application
Application improvements
ActiveX components updated to version 1.0.19.
Bug fix: Solved problem with parsing some Word files that failed or produced duplicated text as output.
Table of Contents
Together Search Server
Search application
New functionalities
Added 'Reset' button for refreshing search form without refreshing entire search page.
Added 'Key' and 'Value' fields for searching document's metadata.
Application improvements
Improved document management actions.
Searching date and time precision is now configurable, based on 'TimeResolution' parameter.
Folders from search result page are now opened in previewer only if inside container file and otherwise as OS folder.
Admin application
New functionalities
Admin forms completely redesigned.
Sites that are being indexed/optimized are now marked and actions are disabled while selected action is in progress.
Application improvements
Added 'TimeResolution' parameter for configuring date and time precision in index fields.
Preview application
Application improvements
Improved document management actions.
Improved encoding support (‘+’ in document names)
Table of Contents
Together Search Server
All applications
Updated libraries:
PDFBox (version 0.7.4)
POI (version 3.0.2-FINAL-20080324)
Lucene (version 2.3.1)
Search application
New functionalities
Added download action on search result page.
Directory search functionality added.
Application improvements
Document management actions improved.
Search suggestion optimized.
Single and multiple document selection for document management actions is now performed by selecting document on search result page (selected documents are marked).
Numerous layout improvements.
Admin application
New functionalities
Detection of password protected documents added.
New parameter introduced: 'Snapper/FlushAfterAdd' - should index be flushed after each added document (default: 'false').
Application improvements
Default value of 'IndexDirectory' parameter changed to 'true'.
'RelativeIndexPath' is now configured in SiteConf.xml for each site (value is based on presence of MappingRoot parameter - if MappingRoot is set, RelativeIndexPath is set to 'true').
Preview application
New functionalities
Detection of password protected documents added.
Numerous layout improvements.
Table of Contents
Together Search Server
Search application
Application improvements
Improved support for document management actions.
Improved multilanguage support.
Preview application
Application improvements
Improved multilanguage support.
Table of Contents
Together Search Server
Admin application
Index/Update procedure modified
New field added to index - HasAttachments ('true' if mail has attachments, 'false' otherwise).
Search application
New design
New searcher design introduced.
Search forms modified
Document group reorganized.
Suggestion support optimized.
More information added to search results.
Preview application
New parameters introduced
"Previewer/DatePattern" - which form of date will be used in Preview application (default: 'dd.MM.yyyy').
"Previewer/Picture/ShowMonochrome" - should images and PDF pages be shown in full color or in black and white (default: 'true' - black and white).
New design
New previewer design introduced.
New functionalities
Translate option optimized.
Added support for translating content of a file inside a container.
PDF preview is now image based.
New functions for image and PDF preview added: Zoom In, Zoom Out, Rotate Clockwise, Rotate Counterclockwise, Fit, Fit Width, Fit Height.
New function for archive file preview added: Navigation through folder structure via folder tree.
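The default value of "Previewer/DatePattern" listed above looks like a java.text.SimpleDateFormat pattern. Assuming it is interpreted that way (the release notes do not say so explicitly), the effect of a pattern can be tried out with a small sketch. The class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class DatePatternDemo {

    /** Formats a date with the given pattern, e.g. the documented default 'dd.MM.yyyy'. */
    static String format(Date date, String pattern) {
        // Locale.ROOT keeps the digits and separators independent of the machine's locale.
        return new SimpleDateFormat(pattern, Locale.ROOT).format(date);
    }

    /** A fixed sample date (24 March 2008) in the default time zone. */
    static Date sampleDate() {
        return new GregorianCalendar(2008, Calendar.MARCH, 24).getTime();
    }
}
```

With the default pattern the sample date is rendered as 24.03.2008. A pattern such as 'yyyy-MM-dd' would give 2008-03-24.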
Table of Contents
Together Search Server
Admin application
Index/Update bug fixed
Problem with deleting index on next index iteration start solved.
Table of Contents
Together Search Server
Preview application
New parameters introduced
"Previewer/WebDavURLs" - server url (multiple paths can be used, separated by "," - for example: http://server1,http://server2,http://server3).
"Previewer/WebDavUsernames" - username for authentication on specified server (multiple usernames can be used, separated by "," - for example: username1,username2,username3).
"Previewer/WebDavPasswords" - password for authentication on specified server (multiple passwords can be used, separated by "," - for example: password1,password2,password3).
"Previewer/UseNTLM" - should NTLM authentication be used on specified server (multiple values can be used, separated by "," - for example: true,true,false).
The order of the values determines which server they apply to - for example: "username1", "password1" and "true" are used for "http://server1"; "username2", "password2" and "true" for "http://server2"; and "username3", "password3" and "false" for "http://server3".
All these parameters are initially commented out.
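The positional pairing of the four comma-separated lists can be sketched as follows. This is only an illustration of the rule described above. The class and method names are hypothetical, and the handling of lists with different lengths is an assumption, since the release notes do not say what the Preview application does in that case.

```java
import java.util.ArrayList;
import java.util.List;

class WebDavServers {

    /** One server entry assembled from the four parallel lists. */
    static final class Server {
        final String url;
        final String username;
        final String password;
        final boolean useNtlm;

        Server(String url, String username, String password, boolean useNtlm) {
            this.url = url;
            this.username = username;
            this.password = password;
            this.useNtlm = useNtlm;
        }
    }

    /**
     * Combines the values of WebDavURLs, WebDavUsernames, WebDavPasswords and UseNTLM
     * by position: the i-th username, password and NTLM flag belong to the i-th URL.
     * Assumption: lists of different length are rejected.
     */
    static List<Server> parse(String urls, String usernames, String passwords, String useNtlm) {
        String[] u = split(urls);
        String[] n = split(usernames);
        String[] p = split(passwords);
        String[] t = split(useNtlm);
        if (n.length != u.length || p.length != u.length || t.length != u.length) {
            throw new IllegalArgumentException("All four lists must have the same number of entries");
        }
        List<Server> servers = new ArrayList<Server>();
        for (int i = 0; i < u.length; i++) {
            servers.add(new Server(u[i], n[i], p[i], Boolean.parseBoolean(t[i])));
        }
        return servers;
    }

    /** Splits on commas and ignores whitespace around them. */
    private static String[] split(String value) {
        return value.trim().split("\\s*,\\s*");
    }
}
```

One consequence of a plain comma-separated format, as sketched here, is that a password containing a comma cannot be expressed.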
Table of Contents
Together Search Server
Preview application
New parameter introduced
"Previewer/InsideContainerHTMLPreview" - should files in container be shown as HTML or plain text (default value: "true" - HTML preview will be used).
Parameter "Previewer/ContainerFilesPreviewLimit" default value changed
New default value is 5
Table of Contents
Together Search Server
All applications
New version of POI introduced:
Version 3.0.1-FINAL-20070705
XSLTC bug prevention:
Using XSLTC is now disabled
Admin application
WebDav support
Added support for indexing files via WebDav with and without NTLM authentication
New option introduced on Admin Paths form: 'Use NTLM authentication' (checkbox)
PPT 95 support
Added support for indexing PowerPoint 95 files
Excel parsing problem fixed
Problem with creating records is now handled (Warning is given, and parsing continues)
Search application
WebDav support
Added support for downloading files via WebDav with and without NTLM authentication
New file download functionality
Files from containers (MSG, ZIP...) are now downloaded directly
Preview application
WebDav support
Added support for downloading and viewing files via WebDav with and without NTLM authentication
New file download functionality
Files from containers (MSG, ZIP...) are now downloaded directly
PPT 95 support
Added support for viewing PowerPoint 95 files
Excel parsing problem fixed
Problem with creating records is now handled (Warning is given, and parsing continues)
Table of Contents
Together Search Server
All applications
Bug fix:
Fixed bug with missing sent and received dates in .msg files.
Table of Contents
Together Search Server
Search application
Bug fix:
Fixed bug with missing scroll bar on search result form.
Table of Contents
Together Search Server
Admin application
Added new functionalities:
Option for alternative .docx to text transformation
Modified option for alternative .docx to html transformation
Option for alternative .xlsx to text transformation
Modified option for alternative .xlsx to html transformation
Option for alternative .pptx to text transformation
Modified option for alternative .pptx to html transformation
XSL transformation procedure for Office 2007 files optimized
New parameters introduced:
'Snapper/Word2007TextTransformationPath'
'Snapper/Word2007HTMLTransformationPath'
'Snapper/Excel2007TextTransformationPath'
'Snapper/Excel2007HTMLTransformationPath'
'Snapper/PowerPoint2007TextTransformationPath'
'Snapper/PowerPoint2007HTMLTransformationPath'
Search application
Added new option for mail search:
Search for mails that are sent OR received in the period between entered dates.
Preview application
Added new functionalities:
Option for alternative .docx to text transformation
Modified option for alternative .docx to html transformation
Option for alternative .xlsx to text transformation
Modified option for alternative .xlsx to html transformation
Option for alternative .pptx to text transformation
Modified option for alternative .pptx to html transformation
XSL transformation procedure for Office 2007 files optimized
Showing translate bar on preview page is now optional
New parameters introduced:
'Previewer/Word2007TextTransformationPath'
'Previewer/Word2007HTMLTransformationPath'
'Previewer/Excel2007TextTransformationPath'
'Previewer/Excel2007HTMLTransformationPath'
'Previewer/PowerPoint2007TextTransformationPath'
'Previewer/PowerPoint2007HTMLTransformationPath'
'Previewer/TranslatorConnectionString'
'Previewer/TranslatorContentString'
'Previewer/TranslatorLangpairString'
'Previewer/TranslatorContentStartString'
'Previewer/TranslatorContentEndString'
'Previewer/ToShowTranslateBar'
Table of Contents
Together Search Server
Admin application
Added support for Office 2007 file formats (docx, pptx and xlsx).
Added new functionalities:
Optimize index
Optimize all indexes at once
Optimize all indexes one by one
New parameters introduced:
'Snapper/Parser/Excel2007/ConverterClassName'
'Snapper/Parser/PowerPoint2007/ConverterClassName'
'Snapper/Parser/Word2007/ConverterClassName'
'Snapper/Word2007TransformationPath'
'Snapper/Excel2007TransformationPath'
'Snapper/PowerPoint2007TransformationPath'
'Snapper/SaveConvertedWord2007'
'Snapper/SaveConvertedExcel2007'
'Snapper/SaveConvertedPowerPoint2007'
'Snapper/ParserCharacterLimit'
'Snapper/CharacterLimitForParser'
'Snapper/ParserPageLimit'
'Snapper/PageLimitForParser'
'Snapper/MaxFileSize'
'Snapper/TimeLimit'
Effect of parameter 'Snapper.MaxIndexLength' changed:
It now sets the maximum number of files that will be indexed/updated in one index/update procedure. If indexing is started in ReIndex mode, the next iteration will start from the last file of the previous iteration.
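The batching behaviour described for 'Snapper.MaxIndexLength' can be sketched as follows. This is an illustration only. The names are hypothetical, and it assumes that an iteration in ReIndex mode resumes directly after the last file processed in the previous iteration, which the release notes do not spell out.

```java
import java.util.List;

class BatchIndexer {

    /**
     * Returns the files to process in one index/update iteration.
     * startIndex is the position reached by the previous iteration
     * (0 for a fresh run); maxIndexLength mirrors 'Snapper.MaxIndexLength'.
     */
    static List<String> nextBatch(List<String> files, int startIndex, int maxIndexLength) {
        int from = Math.min(Math.max(startIndex, 0), files.size());
        int to = Math.min(from + Math.max(maxIndexLength, 0), files.size());
        return files.subList(from, to);
    }
}
```

With five files and a limit of 2, three iterations would process 2, 2 and 1 files, and a fourth iteration would find nothing left to do.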
Search application
Error caused by 'TooManyClauses' exception is now handled properly.
Preview application
New parameters introduced:
'Previewer/Parser/Excel2007/ConverterClassName'
'Previewer/Parser/PowerPoint2007/ConverterClassName'
'Previewer/Parser/Word2007/ConverterClassName'
'Previewer/JBIGConverterPath'
'Previewer/TimeLimit'
'Previewer/ContainerFilesPreviewLimit'
'Previewer/Word2007TransformationPath'
'Previewer/Excel2007TransformationPath'
'Previewer/PowerPoint2007TransformationPath'
Changed default value of parameter 'Previewer/PageLimitForParser' to 'ppt,pdf,docx,xlsx,pptx'.
Changed default value of parameter 'Previewer/CharacterLimitForParser' - added setting for Word (doc).
Changed default value of parameter 'Previewer/FilesInContainer/toPreview' to 'true'.
Table of Contents
Together Search Server
Admin application
- New application parameter introduced:
'Snapper/CheckConnection' - checks the database connection before performing a query (default: 'true')
Deleting content from index during update procedure optimized.
Include list and metadata list reading optimized.
Preview application
- New application parameter introduced:
'Previewer/TimeToSleep' - time (in minutes) that the cleaning thread waits before the next deletion of Preview temporary files (default: 1440 = 24h).
Added operation for deleting temporary files created during preview operations.
Table of Contents
Together Search Server
Admin application
- New application parameters introduced:
'Snapper/OptimizeOnIndex' - should index be optimized after index procedure ends (default: 'true')
'Snapper/OptimizeOnUpdate' - should index be optimized after update procedure ends (default: 'false')
Added new logging information for index optimizing procedure, based on 'OptimizeOnIndex' and 'OptimizeOnUpdate' parameter values.
Table of Contents
Together Search Server
Admin application
Added new logging information for index/update procedure. Messages for start and finish of index/update operations are now logged. Total time of query execution for reading metadata from database is also logged.
Table of Contents
Together Search Server
Admin application
Added new functionality for index/update procedure. Since this version, the time of the last index/update is set to the beginning of the corresponding operation, so all files added or updated during the operation will also be indexed/updated. The operation start time is stored in a file in the index directory.
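Why the start time is recorded rather than the end time can be seen in a minimal sketch. The file name, the storage format (milliseconds as text) and all class and method names are assumptions made for illustration; the release notes only state that the start time is stored in a file in the index directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class IndexTimestamp {

    // Hypothetical file name; the real name is not given in the release notes.
    static final String FILE_NAME = "last-index-start.txt";

    /** Stores the start time of the current index/update operation. */
    static void write(Path indexDir, long startMillis) throws IOException {
        byte[] bytes = Long.toString(startMillis).getBytes(StandardCharsets.UTF_8);
        Files.write(indexDir.resolve(FILE_NAME), bytes);
    }

    /** Reads the stored start time, or 0 if no operation has been recorded yet. */
    static long read(Path indexDir) throws IOException {
        Path file = indexDir.resolve(FILE_NAME);
        if (!Files.exists(file)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return Long.parseLong(text.trim());
    }

    /**
     * A file needs (re)indexing if it changed at or after the start of the last
     * operation. A file modified while that operation was still running is
     * therefore picked up by the next update instead of being missed.
     */
    static boolean needsUpdate(long fileLastModifiedMillis, long lastStartMillis) {
        return fileLastModifiedMillis >= lastStartMillis;
    }
}
```

If the end time were stored instead, a file changed after it was scanned but before the operation finished would look older than the stored time and would be skipped.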
Table of Contents
Together Search Server
Admin application
- Database reconnection attempt added.
- Option for reading special characters in filenames added.
- Current operation logging added (reading metadata list, include list...)
Preview application
- Zoom-in or zoom-out option removes header from image preview page.
Table of Contents
Together Search Server
Admin application
- New application parameter is introduced :
'Snapper.Parser/word/ConverterClassName'
Together Document Viewer
- New application parameter is introduced :
'Snapper.Parser/Word/ConverterClassName'
Table of Contents
Together Search Server
Admin application
- Added support for indexing the WebDAV file metadata field 'created' and the FTP file metadata field 'owner'.
- DODS removed.
- Added support for indexing the compound file metadata 'Author' and 'Last saved By' (Word, Excel and PowerPoint documents) and the e-mail metadata 'From', 'To', 'CC', 'BCC' and 'Subject'; corresponding new fields are introduced in the Lucene index.
Search application
- The search result page now shows the search parameters and fills in the search input field and the site search boxes again.
- E-mail search introduced: search over the e-mail specific properties 'From', 'To', 'CC', 'BCC' and 'Subject'.
- Support for sorting e-mail search results: 'by newest/oldest sent/received E-mail files' and 'by From/To E-mail is sent'.
- Advanced search through the file metadata 'Author' and 'Last saved By' introduced.
- New look is introduced.
Parser
PowerPoint, Word and Excel parsers: added support for extracting compound file metadata ('Author', 'Last saved By' ...).
Table of Contents
Together Search Server
Admin application
- Added support for indexing via WebDAV.
- FTP indexing reworked. Subfolders can now be indexed.
- File size introduced (new field in index).
- File owner introduced (new field in index).
- New application parameter is introduced :
'Snapper.Parser/PowerPoint/ConverterClassName'
Search application
- Search through file metadata introduced: created, accessed, owner.
- Search through sent/received dates of e-mails introduced.
Together Document Viewer
- New application parameter is introduced :
'Snapper.Parser/PowerPoint/ConverterClassName'
Table of Contents
Together Search Server
Admin application
- Support for modification of configuration files.
- Support for index process without 'TAS' running.
- New application parameters are introduced :
- 'Snapper.Parser/Excel/ConverterClassName'
- 'Snapper.SaveConvertedFile'
- 'Snapper.PathOfConvertedFiles'
These parameters add new functionality: the plain text of an Excel file is stored in the index, and the converted text, if it is in HTML form, can be saved to the file system as an HTML document.
Search application
- Configuration files are re-read every x minutes (for this, the new application parameter 'Snapper.ReReadConfigFilesEveryMinutes', defined in the application configuration file 'web.xml', is introduced).
- Introduction of search suggest logic.
Together Document Viewer
Container (archive) files (zip and tgz) are parsed and displayed with the subpaths of their contents; subpaths are shown as "directory".
"Zoom" links added to the picture preview.
The content of Excel files can be displayed separately from the indexed data if a converted 'html' file exists on the file system.
Configuration files are re-read every x minutes (for this, the new application parameter 'Snapper.ReReadConfigFilesEveryMinutes', defined in the application configuration file 'web.xml', is introduced).
Table of Contents
EML Parser
Additional properties (Signed, Priority, Read Receipt Requested, Delivery Receipt Requested, Expires and Sensitivity) are introduced.
MSG Parser
Additional properties (Signed, Priority, Read Receipt Requested, Delivery Receipt Requested, Expires and Sensitivity) are introduced.
Note: Extraction of headers (Signed, Priority ...) from an 'msg' file is supported only if they exist in the corresponding file.
Additional information about files (creation time and last accessed time) is introduced.
Accordingly, the new application parameter 'Snapper.IndexOSspecific' (defined in the application configuration file 'web.xml') is introduced.
Table of Contents
Additional information about files (creation time and last accessed time) is introduced.
Accordingly, the new application parameter 'Snapper.IndexOSspecific' (defined in the application configuration file 'web.xml') is introduced.
Table of Contents
Removed the Java source files generated by 'Enhydra Zeus'; for generating and validating 'xml' files, 'Together Search Server' and 'Together Document Viewer' now use 'XMLBeans' (Apache XML project).
Lucene
Version 2.0.0 is introduced.
Removed support for indexing data into a database. Accordingly, the application parameter 'Snapper.IndexType' (defined in the application configuration file 'web.xml') is removed.
New functionality to index only 'MetaData' is introduced.
Accordingly, the new application parameters 'Snapper.DocumentUpdate' and 'Snapper.DocumentUpdatePattern' (defined in the application configuration file 'web.xml') are introduced.
Accordingly, a new file type ('NULL') and a new 'Document Group' ('Meta Data') are introduced.
DocBook stylesheets release: 1.70.1 is introduced.
Table of Contents
Enhydra Snapper without database dependency.
Snapper Admin
- New application parameter 'Snapper.DocumentGroupConfFile' (defined in application configuration file - 'web.xml') is introduced.
- New application parameter 'Snapper.SiteConfFile' (defined in application configuration file - 'web.xml') is introduced.
- New application parameter 'Snapper.StatisticActive' (defined in application configuration file - 'web.xml') is introduced.
- New application parameter 'Snapper.StatisticDirectory' (defined in application configuration file - 'web.xml') is introduced.
Snapper
- New application parameter 'Snapper.DocumentGroupConfFile' (defined in application configuration file - 'web.xml') is introduced.
- New application parameter 'Snapper.SiteConfFile' (defined in application configuration file - 'web.xml') is introduced.
- New application parameter 'Snapper.StatisticActive' (defined in application configuration file - 'web.xml') is introduced.
- New application parameter 'Snapper.StatisticDirectory' (defined in application configuration file - 'web.xml') is introduced.
Snapper Previewer
- New application parameter 'Snapper.DocumentGroupConfFile' (defined in application configuration file - 'web.xml') is introduced.
- New application parameter 'Snapper.SiteConfFile' (defined in application configuration file - 'web.xml') is introduced.
Table of Contents
Snapper Admin
New application functionality: Re-index mode (ability to continue with indexing).
New application parameter 'Snapper.Indexer.ReIndexMode' (defined in application configuration file - 'web.xml') is introduced.
Table of Contents
Snapper Logging
New class 'MonologLoggingManager' is introduced.
Snapper Admin
New application parameter 'Snapper.MaxPropertiesLength' (defined in application configuration file - 'web.xml') is introduced.
Table of Contents
PDF Parser
Removed extraction of title.
Snapper
New application parameter 'Snapper.ResultDatePattern' (defined in application configuration file - 'web.xml') is introduced.
URL parameter 'resultDatePattern' is introduced. If defined, it overrides application parameter 'Snapper.ResultDatePattern'.
URL parameter 'datePattern' is renamed to 'searchDatePattern'.
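The override rule for the date pattern parameters can be sketched as follows. The class and method names are hypothetical. Treating an empty URL parameter as "not defined" is an assumption, since the release notes only say that a defined URL parameter overrides the application parameter.

```java
import java.util.Map;

class DatePatternResolver {

    /**
     * Returns the URL parameter value if it is defined, otherwise the value of the
     * application parameter from 'web.xml' (e.g. 'Snapper.ResultDatePattern').
     */
    static String resolve(Map<String, String> urlParams, String urlParamName, String applicationValue) {
        String fromUrl = urlParams.get(urlParamName);
        if (fromUrl != null && !fromUrl.trim().isEmpty()) {
            return fromUrl.trim();
        }
        return applicationValue;
    }
}
```

A request that carries 'resultDatePattern' would therefore get its own pattern, while a request without it falls back to 'Snapper.ResultDatePattern'. The same rule applies to 'searchDatePattern' and 'Snapper.SearchDatePattern'.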
Table of Contents
Snapper
New application parameter 'Snapper.SearchDatePattern' (defined in application configuration file - 'web.xml') is introduced.
URL parameter 'datePattern' is introduced. If defined, it overrides application parameter 'Snapper.SearchDatePattern'.
Table of Contents
Snapper Previewer
The AbsoluteFilePath element of the resulting XML is now more structured.
Configuration parameters are overridden by URL parameter settings.
Table of Contents
The indexer can now add document metadata from the metadata database to the index content. New Snapper Admin parameter introduced:
Indexer.MountMetaDataInContent
Snapper supports Google search. New parameters are introduced:
GoogleSearcherURL
GoogleSearcherKey
GoogleResultLimit
New Snapper Previewer parameters for document parser limits are implemented:
ParserPageLimit
PageLimitForParser
ParserCharacterLimit
CharacterLimitForParser
Document group
New document types and document group.
Parsers
Detection of not parsed files.
Snapper Previewer
Introduction of translation of parsed content.
Introduction of Google search result in Snapper Searcher.
Table of Contents
Improved log
New Snapper Admin parameters
Indexer.MountFilePathInContent
Setting it to true will put file path in content
Indexer.MountPropertiesInContent
Setting it to true will put properties in content
New Snapper search application parameter
Snapper.PreviewURL
URL used for creating document preview link - represents snapper previewer application URL.
Document group
Document types can be grouped for quick search
New indexing options
Directory indexing
Content indexing
Parsers
File type mapping
It is possible to map additional type to file parser, e.g. map .properties with text parser.
Word and Excel parser
Removed extraction of title.
New enhydra application - Snapper Previewer
XML based document preview, containing content and other relevant data of the document.
Preview for files from file system.
Document properties, title, filepath are removed from document content in a preview.
Table of Contents
Improved logging in snapperAdmin application (explanation why document was not parsed).
Excel Parser
- Resolved problem that some regular files were not parsed.
New Application parameter 'Indexer.MountTitleInContent' (SnapperAdmin application).
Parameter SimpleSearch.Type in web.xml of Snapper application is not supported any more. Because of that, the search.xsl and searchResult.xsl files were changed, and the http request parameter 'typeOfSearch' for the Search.po object is removed (Simple and Advanced search options are merged).
Improved time of search.
Experimental indexing into Data Base.
New Application parameter 'IndexType' (SnapperAdmin and Snapper application).
Table of Contents
Fixed problem with indexing FTP files.
Application parameter RelativeIndexPath (SnapperAdmin and Snapper).
Application parameter FileSeparator (SnapperAdmin and Snapper).
Table of Contents
Configurable Simple search: search across the title and the content of the document, or only in the content of the document. Defined in web.xml file of snapper searcher (parameter SimpleSearch.Type).
Changed search.xsl and searchResult.xsl files to support configurable Simple search.
Table of Contents
Snapper documentation is changed to DocBook format. It is shipped in HTML and PDF format.
Solved a ConnectionAllocation problem
Solved a possible problem between DODS, Enhydra and Snapper when indexing in threads
New authentication type: Tomcat authentication
Additional build targets in build.xml
Table of Contents
Solved a possible problem between DODS, Enhydra and Snapper when indexing in threads
Increased the speed of index by optimizing database calls
Table of Contents
Solved the memory leak problem
Repaired a small bug involving not indexed files presentation
Table of Contents
Changes on the SiteList page. Links instead of buttons of IndexAll
Small bugs in indexing the include list removed
Table of Contents
Documented and tested Metadata feature of application
Smaller changes in the look of HTML pages
Refactoring and solving smaller bugs
Table of Contents
Index All functionality: all sites at once, or one by one
Column name in the NotIndexed table changed from FILE to FILENAME due to MSQL constraints
Included scripts for all DODS supported database servers
Table of Contents
When filled out, the Index Dir attribute of a Site is the *exact* location of the index
Include/Exclude list as Site attributes
Application parameter Snapper.LogicalNameFromDatabase - 0/no, 1/yes: use metadata's DocumentLogicalName field
Application parameter Snapper.DocumentLogicalName - Metadata field to use as document title
Application parameter Snapper.DBFetchSize - maximal DB fetch size
Other extensions indexing ("other" checkbox/attribute for a Site)
New Menu look
Threaded indexing
Number of documents per Site are displayed on the indexing history page
Multiple real-time checks using XMLRequest when creating a Site
Search results XML includes 'configuration' section with all information about Sites
Search results XML includes all document paths (relative, absolute...)
Search results XML includes 'searched parameters' section containing all request parameters
Using Zeus created classes to manipulate results XML files
Snapper Search application is made up of one presentation object: Search.po
Search.po object uses request parameters to form an XML file defined by a search.dtd
The po object uses an xsl file defined in xsl request parameter to transform the xml file and display the results
Three xsl files as examples: search.xsl, advancedsearch.xsl, searchresults.xsl
New version of PDFBox (0.7.1) included
Table of Contents
Application split into two applications (wars): snapperAdmin, snapper (search)
Sorting of search results (Sorting types : by relevance , by newest modified files and by oldest modified files)
DB Parameter "Search" as an attribute of Site, indicating a Site to be automatically included in search
History of indexed sites: start time, stop time, length, type (index)
History log of unindexed files per site with the possibility of file downloading
Searching for property values (key=value)
Search results page displays a "new search" dialog
DB Parameter "IndexDir" as an attribute of Site. If not entered, the default "Snapper/IndexDir" in web.xml is used for indexing the selected site
Searching by document type
Sample filterDB for filtering documents not to be indexed during indexing.
Login dialog for snapperAdmin. By default, username: admin password: snapper
Capability to update Site DB attributes - Update Site window
Search results page displays a Site as well to which the file belongs
Modular design, separate modules for: API, Parsers, Indexer, Searcher, Kernel, Logging, Util
Table of Contents
Solved "OutOfMemory" exception while parsing Excel file(s)
Added support for PPT and PPS file types
Results page, bottom: current page link not displayed
Table of Contents
First release!
API (SnapperCore) released
Database released
Indexing Implementation. Full indexing of a site (delete if exists and recreate)
Site management Implementation: adding a new site, defining maximal size and age of files to be indexed as well as file types being indexed
Site-Path management Implementation. Adding paths to a Site and deleting paths
Three path types and therefore three indexing methods/protocols implemented: local FileSystem, UNC, FTP
Basic search Implementation. Complex searches "AND", "OR" "title:" etc. implemented
Advanced search Implementation. Searching for title, modification date, custom parameters (Ms Word files) implemented
Configurable XSL transformations for search results. Configuration of XSL transformation can be performed in runtime by changing the configuration file (see documentation)
Lucene indexing engine version 1.4.3 implementation
LuceneIndexer wrapper for Lucene's IndexWriter
LuceneSearcher wrapper for Lucene's IndexSearcher
LuceneReader wrapper for Lucene's IndexReader
Implemented parsers for following file types: Plain text (.txt, .java, .ini), MS Outlook Express (.eml), MS Word (.doc), MS Excel (.xls), Open Office Writer (.sxw), Open Office Calc (.sxc), HTML (.html, .htm), Adobe Portable Document Format (.pdf), Rich Text Format (.rtf).
Basic indexing statistics: number of documents a site contains, date and time of last indexing
Basic search statistics: number of hits (searches) for a site
Subfolder indexing for UNC and FileSystem
Mapping of site paths for FileSystem, UNC and FTP
Trigger event (logging a message) after the index size has crossed the given size (MaxIndexLength parameter in web.xml) during indexing
Enabling file download on the search results page through a "Download" parameter in web.xml
Complex search explanations on both (basic, advanced) Search pages
Number of search results per page can be selected from a combo box on the search page (10, 20, 30...)
web.xml configurable parameters
Might get an error after path is deleted and new search over the site containing the path is performed
Parser constraints:
Some MS Word documents cannot be parsed (due to complex entries)
Some MS Excel documents cannot be parsed (due to complex entries)
RTF files created within MS Word cannot be parsed (not a clean RTF)
PPT, PPS parsers in beta testing, not released
Some Firefox browser issues