Biblio Tech
Review
Information Technology for Libraries

Data Magician


Data Magician - conversion without programmers?

Data conversion is often more about understanding what needs to be done than about writing code. Data Magician allows non-programmers to specify conversion tasks and keep control of exactly what goes on - but how well does it do? Peter Evans reviews.

Many of the data manipulation and conversion programs around are not tuned to the specific needs of library and bibliographic data, and very few can handle MARC records. Data Magician is authored by Lawrence Folland, a Canadian with long experience in the field of bibliographic file manipulation, and the program has excellent MARC record features as a result.

The aim of DM is clear enough: to provide a tool for converting data files from one format to another - the sort of thing that a major vendor will do as part of a migration from one system to another. But often you may have some odd files around in one format or another and want to load them into another system. Data Magician is an ideal tool for this job. It requires no programming skills, although familiarity with file structures and data handling will be very useful.

Conclusions
Once you have become familiar with the DOS interface, DM does a very efficient job in an unfussy way. The powerful processing codes make possible very complex tasks which you might otherwise need an expensive programmer to achieve. The ability to handle MARC records is of course a must for this sort of software in a library environment, and DM does this without any problems.

Cost $195 Canadian from Folland Software Services Inc.

The manual is very clear and comprehensive, with numerous worked examples. Installation is easy, and DM provides a set of pre-defined "settings" files which handle 18 common conversions - including DOBIS and Geac MARC records to standard USMARC.

The user interface is the once-familiar DOS style of hierarchical menus, with actions selected by moving through the options and pressing the Enter key. It is remarkable how old-fashioned this appears now - but for this sort of application, it is no real hindrance.

The procedure for using DM is basically two steps - creating a conversion specification and then running the program against an input file. Creating the specification logically requires two parts: an input spec, which tells DM how the incoming data is laid out, and an output spec, which says how the data should be formatted in the output file.

There are 8 basic input file types covering ASCII delimited, dBASE, MARC, tagged files, INMAGIC, etc. (see full list). There are options to set repeatable field formats, end-of-record characters and so on. You can also set Global field processing codes which can, for example, trim off unwanted characters like line numbers before the start of the data proper. These sorts of features allow DM to be used on transcripts of on-line sessions, where you may need to inspect lines for remote system prompts etc.
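To make the line-number trimming and prompt checking concrete, here is a rough Python sketch of the idea. It is not DM's own syntax, which is set through menus and processing codes; the pattern, the prompt strings and the function names are all assumptions made up for illustration.

```python
import re

# Assumed layout: a leading run of digits, optional punctuation, then spaces.
LINE_NUMBER = re.compile(r"^\s*\d+[.:)]?\s+")


def strip_line_number(line: str) -> str:
    """Remove a leading line number, if present, and leave the rest untouched."""
    return LINE_NUMBER.sub("", line, count=1)


def is_prompt(line: str, prompts=("READY>", "Command:")) -> bool:
    """Flag remote-system prompt lines (the prompt strings are invented)."""
    return line.strip().startswith(prompts)


def clean_transcript(lines):
    """Drop prompt lines and trim line numbers from everything else."""
    cleaned = []
    for line in lines:
        text = strip_line_number(line.rstrip("\n"))
        if text and not is_prompt(text):
            cleaned.append(text)
    return cleaned


session = ["001 Title: Moby Dick", "READY>", "002 Author: Melville"]
print(clean_transcript(session))
```

Note that a pattern this simple would also trim genuine data that happens to start with digits, which is exactly the kind of thing you need to understand about your file before converting it.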

The options for specifying the input file format differ according to the type of file being processed. MARC records can be described tag by tag, e.g. 245 = Title, and any local tags can be defined as required. Leader information codes (status, encoding level etc.) are predefined, so there is no need to delve into character positions for this data.
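For readers who have not met the MARC leader, the sketch below shows in Python what "delving into character positions" would otherwise involve. The positions are those of the standard 24-character MARC leader; the tag table, the local tag 950 and the function name are assumptions for illustration only and say nothing about how DM stores its definitions.

```python
# Tag-to-label table in the spirit of "245 = Title".
TAG_LABELS = {
    "100": "Author",
    "245": "Title",
    "260": "Imprint",
    "950": "Local shelfmark",  # hypothetical local tag
}

# Character positions (0-based) within the 24-character MARC leader.
LEADER_FIELDS = {
    "record_status": 5,
    "record_type": 6,
    "bibliographic_level": 7,
    "encoding_level": 17,
}


def describe_leader(leader: str) -> dict:
    """Pull the named one-character codes out of a MARC leader."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is always 24 characters long")
    return {name: leader[pos] for name, pos in LEADER_FIELDS.items()}


codes = describe_leader("00714cam a2200205 a 4500")
print(codes["record_status"], codes["record_type"], codes["bibliographic_level"])
```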

Once you have described the input file, you go on to specify which input fields are mapped to which output fields and, via the processing codes, how they are massaged on the way through.
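The input spec, field mapping and output spec can be pictured with a small Python sketch. This is only an analogy for what a DM settings file captures: the tagged-file layout, the tags TI/AU/YR, the "$$" record terminator and all the names below are invented for the example.

```python
import csv
import io

# Input spec: how the incoming tagged file is laid out (assumed layout).
INPUT_SPEC = {"field_separator": ": ", "record_terminator": "$$"}

# Mapping: which input field feeds which output column.
FIELD_MAP = {"TI": "title", "AU": "author", "YR": "year"}


def parse_records(text, spec=INPUT_SPEC):
    """Yield one dict per record, according to the input spec."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if line == spec["record_terminator"]:
            if record:
                yield record
            record = {}
        elif spec["field_separator"] in line:
            tag, value = line.split(spec["field_separator"], 1)
            record[tag] = value
    if record:  # final record with no terminator after it
        yield record


def convert(text):
    """Output spec: comma-delimited text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(FIELD_MAP.values()), lineterminator="\n")
    writer.writeheader()
    for rec in parse_records(text):
        writer.writerow({col: rec.get(tag, "") for tag, col in FIELD_MAP.items()})
    return out.getvalue()


sample = "TI: Moby Dick\nAU: Melville, Herman\nYR: 1851\n$$\nTI: Emma\nAU: Austen, Jane\n$$\n"
print(convert(sample))
```

Keeping the layout description separate from the mapping is what lets one input spec be reused with several different outputs.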

Within the processing codes lies the true power of DM. Field data can be broken out on the occurrence of specific characters and strings. Data can be deleted, copied to other fields, and have punctuation etc. added to it. Powerful conditional processing allows you to decide what happens when specific data strings are encountered.  Quit options enable a field, record or the whole process to be abandoned in specific circumstances.
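The flavour of these processing codes can be suggested in Python, though this is a sketch of the general technique and not DM's actual codes. The field names, the trigger strings "WITHDRAWN" and "END OF FILE", and the three exception classes standing in for the field, record and process quit options are all assumptions.

```python
class SkipField(Exception):
    """Abandon the current field only."""


class SkipRecord(Exception):
    """Abandon the current record."""


class StopRun(Exception):
    """Abandon the whole conversion."""


def split_author(value: str) -> str:
    """Break the field on the first comma and rebuild it with tidy punctuation."""
    if not value.strip():
        raise SkipField()
    surname, _, forenames = value.partition(",")
    surname, forenames = surname.strip(), forenames.strip().rstrip(".")
    return f"{surname}, {forenames}." if forenames else f"{surname}."


def check_status(value: str) -> str:
    """Conditional processing: react to specific strings in the data."""
    if value == "WITHDRAWN":
        raise SkipRecord()
    if value == "END OF FILE":
        raise StopRun()
    return value


RULES = {"author": split_author, "status": check_status}


def run(records):
    output = []
    for record in records:
        converted = {}
        try:
            for name, value in record.items():
                rule = RULES.get(name, lambda v: v)
                try:
                    converted[name] = rule(value)
                except SkipField:
                    continue  # drop this field, keep the record
        except SkipRecord:
            continue  # drop this record, keep going
        except StopRun:
            break  # abandon the rest of the file
        output.append(converted)
    return output
```

Given a record with the author "Melville,Herman", this sketch would emit "Melville, Herman." while a record whose status is "WITHDRAWN" would be left out of the output altogether.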

Nice features include the ability to number records automatically, and convert dates.

Once the data output has been set up, you then run the program against the input file and watch the counter tick round.  Batch control parameters are good.  You can set up log files, set a start number and number to process so that a file can be tested and processed in chunks easily.
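The start number and number-to-process idea is easy to picture in code. The Python below is a hedged sketch of that kind of batch control, not DM's implementation; the function name, its parameters and the log message format are assumptions.

```python
from itertools import islice


def process_chunk(records, convert, start=1, count=None, log=None):
    """Convert `count` records beginning at the 1-based position `start`.

    `records` may be any iterable, `convert` is whatever per-record
    conversion applies, and `log` is an optional list standing in for
    a log file.
    """
    stop = None if count is None else start - 1 + count
    results = []
    for number, record in enumerate(islice(records, start - 1, stop), start=start):
        results.append(convert(record))
        if log is not None:
            log.append(f"record {number} converted")
    return results


# Test a small slice of a 100-record file before committing to the whole run.
log = []
sample = process_chunk(iter(range(1, 101)), str, start=11, count=5, log=log)
print(sample)
print(log[0])
```

Working in chunks like this means a mistake in the spec costs a few seconds of reprocessing and not a full run.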

Practicing on a file of fairly brief MARC records, I was processing about 600 per minute - so you could easily process a moderate-sized catalogue of about 100,000 records in about 3 hours.
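The estimate is simple to check, and to redo for a catalogue of a different size. The short Python helper below is just that arithmetic; the function name is made up, and the rate will of course vary with record length and hardware.

```python
def estimated_hours(record_count: int, records_per_minute: float) -> float:
    """Running time in hours at a steady processing rate."""
    return record_count / records_per_minute / 60


# 100,000 records at 600 per minute is about 167 minutes.
print(round(estimated_hours(100_000, 600), 1))  # 2.8
```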

The manual is excellent with lots of examples to make the job of creating a settings file easier.  Mostly you can work from the sample files and modify them as required.