Minimum Project Technical Standards
From the Alberta Heritage Digitization minimum standards
The AHDP employs standards consistent with academic and industry practice for the digitization of paper documents. The industry-standard TIFF format with LZW compression is used for high-quality archival digital images. Documents are scanned at original size at 300 dpi in 8-bit greyscale mode to preserve as much of the original document information as possible. Colour documents are scanned in 24-bit RGB.
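To give a sense of what these scan settings imply for storage, the following is a minimal Python sketch that works out pixel dimensions and uncompressed image size. The 8.5 x 11 inch page size and the function name `uncompressed_scan_bytes` are assumptions chosen for illustration; the figures ignore TIFF headers and LZW compression, which will reduce the size on disk.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi=300, bits_per_pixel=8):
    """Return (width_px, height_px, size_in_bytes) for an uncompressed scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    size_bytes = width_px * height_px * bits_per_pixel // 8
    return width_px, height_px, size_bytes


# Assumed page size: 8.5 x 11 inches (not stated in the standards).
greyscale = uncompressed_scan_bytes(8.5, 11)                    # 8-bit greyscale
colour = uncompressed_scan_bytes(8.5, 11, bits_per_pixel=24)    # 24-bit RGB

print(greyscale)  # (2550, 3300, 8415000)
print(colour)     # (2550, 3300, 25245000)
```

Under these assumptions a colour scan is three times the size of a greyscale scan of the same page, roughly 25 MB against 8.4 MB before compression.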
The workflow for digitization consists of the following:
Equipment and Software Used
|Character position|Description|
|1|Letter - Describes the general collection to which the item belongs (handles 26 collections)|
|2-3|Alphanumeric - Identifies the specific item in the collection (handles 1,296 items)|
|4|Letter - Describes the type of file (A - size 1 display, B - size 2 display, etc.)|
|5-8|Numeric - Identifies the specific page of the document (handles 10,000 pages)|
|9-11|Letter - standard MIME extension for the file format|
The AHDP uses the first three characters (the collection letter and the item identifier) to provide persistent access to the item on its site.
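The naming scheme above can be sketched as a small Python parser. This is an illustration only, not the AHDP's own code: the names `FILENAME_RE`, `ScanName` and `parse_scan_name` are hypothetical, and the sketch assumes upper-case letters in positions 1 to 4 and a dot separating the eight-character name from the three-letter extension.

```python
import re
from typing import NamedTuple

# Assumed layout: 8 characters, a dot, then a 3-letter extension.
FILENAME_RE = re.compile(
    r"^(?P<collection>[A-Z])"     # position 1: collection letter (26 values)
    r"(?P<item>[A-Z0-9]{2})"      # positions 2-3: item (36 * 36 = 1,296 values)
    r"(?P<file_type>[A-Z])"       # position 4: type of file
    r"(?P<page>[0-9]{4})"         # positions 5-8: page (0000-9999)
    r"\.(?P<extension>[A-Za-z]{3})$"  # positions 9-11: file extension
)


class ScanName(NamedTuple):
    collection: str
    item: str
    file_type: str
    page: int
    extension: str

    @property
    def item_key(self):
        """First three characters: collection letter plus item identifier."""
        return self.collection + self.item


def parse_scan_name(filename):
    """Split a file name into its fields, or raise ValueError if it does not fit."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a valid scan file name: {filename!r}")
    return ScanName(
        collection=match["collection"],
        item=match["item"],
        file_type=match["file_type"],
        page=int(match["page"]),
        extension=match["extension"],
    )
```

For a hypothetical file name such as `A0BA0001.tif`, `parse_scan_name` yields collection `A`, item `0B`, file type `A`, page 1 and extension `tif`, and `item_key` gives `A0B`, the three characters used for persistent access.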
Notes: DC_Subject and DC_Description are currently not used.
Currently the AHDP uses a database structure that tracks administrative and descriptive metadata for item-level and page-level information in relatively simple table structures. Additional tables are used for controlled vocabularies, but all information at either the page or item level can be exported to a comma-delimited format. The core database engine for the AHDP is the Microsoft Database Engine (MDE); the project uses both Microsoft Access and SQL Server for its operations.
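As an illustration of the comma-delimited export described above, the following Python sketch writes page-level records with the standard `csv` module. The column names (`item_key`, `page`, `title`) and the sample rows are hypothetical, since the actual AHDP table layout is not given here, and the function `export_rows` is not part of any AHDP tool.

```python
import csv
import io


def export_rows(rows, fieldnames):
    """Return the given records as comma-delimited text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical page-level records; real data would come from the database.
pages = [
    {"item_key": "A0B", "page": 1, "title": "Annual report, 1912"},
    {"item_key": "A0B", "page": 2, "title": "Index"},
]

print(export_rows(pages, ["item_key", "page", "title"]))
```

Fields that contain a comma, such as the first title, are quoted automatically by the `csv` module, so the export remains unambiguous when read back.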
One of the guiding principles for the AHDP is to provide a simple interface to the resources. Complex client-side scripting and the use of plug-ins are avoided where possible. Pages are created using standard HTML 4.0. The majority of the scripting and validation occurs at the server level.
Web Site Auditing and Evaluation
Complete logs of server activity are maintained by the AHDP.
Programming and Scripting Languages
Currently, the AHDP employs Active Server Pages 3.0 for its server-side scripting. Cookies have been avoided in the past because of privacy concerns and a lack of support on the client side (whether by choice or because of incompatible clients). As stated above, client-side scripting is avoided wherever possible so that the majority of users will have little or no trouble accessing the site.
Preservation and Records Management
The AHDP currently uses CD-R technology for long-term storage of digital files and employs only standard file formats (TIFF, JPEG, HTML) that are open and non-proprietary for all of its work. CD-Rs are written in the standard ISO 9660 format to ensure maximum compatibility.
To ensure long-term access to the media, the AHDP creates CRC-32 values for individual files and performs spot checks on the media on a regular basis. Duplicate copies of the media are made so that one copy can be stored on-site and one off-site. In addition to backing up the database information separately, the AHDP plans to save the metadata with the files in XML format once the template has been developed.
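The CRC-32 check described above can be sketched in Python with the standard `zlib` module. This is a minimal illustration under stated assumptions, not the AHDP's actual procedure: the function names `crc32_of_file`, `build_manifest` and `verify_manifest` are hypothetical, and the sketch assumes all files of interest sit directly in one directory.

```python
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file, reading it in chunks to limit memory use."""
    crc = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def build_manifest(directory):
    """Map each file name in the directory to its CRC-32 value."""
    return {
        path.name: crc32_of_file(path)
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }


def verify_manifest(directory, manifest):
    """Return the names of files that are missing or whose CRC-32 has changed."""
    failed = []
    for name, expected in manifest.items():
        path = Path(directory) / name
        if not path.is_file() or crc32_of_file(path) != expected:
            failed.append(name)
    return failed
```

A manifest would be built once when a disc is written and kept with the database backup; a later spot check calls `verify_manifest` on the mounted disc, and an empty list means every listed file still matches its recorded value. CRC-32 detects accidental corruption of this kind but is not designed to detect deliberate tampering.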