Product/Service

ParseRat: file parser, converter and re-organizer

Source: Guy Software
Extract usable data from just about anything:
  • Fixed format lines up to 8000 bytes long.
  • Delimited files: almost any delimiter and up to 500 fields.
  • dBase files (up to 500 fields).
  • Binary files: Many binary files consist of a header followed by a series of fixed-format records. ParseRat lets you skip the header (if any) and export fields from the records that follow, translating binary numbers if needed. Version 1.0h added the ability to process "Packed Decimal" (COBOL Comp-3 type) and "Big Endian" (Motorola pattern) as well as "Little Endian" (Intel pattern) binary numbers.
  • EBCDIC (added with v1.0k): ParseRat can read EBCDIC files, typically used on mainframes. It correctly handles packed, zoned and binary numbers that were garbled by character-by-character conversion of EBCDIC files in file transfer tools such as IND$FILE and Kermit.
  • Page Image Files, single or multiple records per page, with or without headers and footers, fixed format blocks or tag/flag-defined fields.
  • HTML files and captured/saved web pages: ParseRat can define single or multi-line blocks based on tags and extract fields within them, also based on tags. This makes it ideal for extracting data from structured web pages and HTML files (e.g. convert your Netscape bookmark or address file to a database or mailmerge file).
  • Take files posted as a result of users completing HTML forms on web pages and convert them into databases.

Your proprietary library package won't export your collection data to dBase? No problem, just have it print catalogue cards to a print file. ParseRat will read the print file and extract the data for you.
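
As an illustration of the binary-number translation described in the list above, here is a minimal Python sketch of decoding "Packed Decimal" (Comp-3), Big Endian and Little Endian fields from fixed-format records that follow a header. It is not ParseRat's own code: the record layout, the field widths and the function names (unpack_comp3, iter_records, decode_record) are all assumptions made for the example.

```python
from decimal import Decimal


def unpack_comp3(field: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal (COBOL Comp-3) field.

    Each byte carries two 4-bit digits; the last nibble is the sign
    (0xB or 0xD means negative, anything else is treated as positive).
    """
    nibbles = []
    for byte in field:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()  # final nibble is the sign, not a digit
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid packed-decimal digit")
    value = 0
    for n in nibbles:
        value = value * 10 + n
    if sign in (0x0B, 0x0D):
        value = -value
    # Shift the implied decimal point 'scale' places to the left.
    return Decimal(value).scaleb(-scale)


def iter_records(blob: bytes, header_len: int, record_len: int):
    """Skip the header, then yield each complete fixed-length record."""
    body = blob[header_len:]
    for start in range(0, len(body) - record_len + 1, record_len):
        yield body[start:start + record_len]


def decode_record(record: bytes):
    """Hypothetical 7-byte layout: 2-byte Little Endian id,
    2-byte Big Endian count, 3-byte packed amount with 2 decimals."""
    rec_id = int.from_bytes(record[0:2], "little")
    count = int.from_bytes(record[2:4], "big")
    amount = unpack_comp3(record[4:7], scale=2)
    return rec_id, count, amount


# Made-up sample file: 4-byte header followed by two records.
sample = (
    b"HDR!"
    + bytes([0x02, 0x01, 0x01, 0x02, 0x12, 0x34, 0x5C])
    + bytes([0x01, 0x00, 0x00, 0x01, 0x00, 0x99, 0x9D])
)
rows = [decode_record(r) for r in iter_records(sample, header_len=4, record_len=7)]
```

In this sample the same two bytes (01 and 02) decode to 258 in either byte order only because they are stored in opposite orders, which is the distinction between the Intel and Motorola patterns.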

Filter the data to a different format:

  • Change date and time formats. Read foreign language dates. Extract integer Year, Month, Day, Hour, Minute, Second. Generate Day of Week. Create four digit years from two digit years with intelligent century choice for Year 2000 (Y2K) conversions. Translate to and from Julian Day Numbers (days since 4713 B.C.) with or without fractional days. Translate "seconds since 1970" date/times.
  • Recognise decimal, hexadecimal and scientific format numbers in input fields.
  • Export numeric data in decimal, hexadecimal and scientific formats, with or without leading zero fill for fixed length fields.
  • Split and combine input fields.
  • Reassemble and re-parse improperly parsed groups of input fields produced by other products.
  • Analyse name information. Parse out names into title, first name, middle names, last name, name suffix components. Get usable mailing names from those telephone list CD-ROMs by cutting out noise phrases like "Teen Phone", "Fax Line", "Residence" etc. Correctly recognise multi-word surnames like "Van der Pohl" and "de la Salle".

    The CD-ROM you bought has names like "de la Mere Michael S Arnold Jr Dr Psychtrst Residence" in it. You can hardly use it as a mailing list in that format. No problem, ParseRat will change it to "Dr Michael S Arnold de la Mere Jr" (and if you want, it will even leave off the "Jr" or export the name in fields as "Dr","Michael","S Arnold","de la Mere","Jr").

  • Genderize names, creating default male and female titles from first names. As of v1.0i, 10,000 names are pre-programmed and you can import 10,000 more.
  • Analyse street address information. Parse out into suite designator, suite number, street number, street name, street designator, street direction, odd/even and put non-civic address elements into a separate field. Produce a "standardised" address if required.
  • Analyse City address line, parsing out into City, State/Province, Postal/Zip and substituting approved abbreviations. Recognise US, Canadian, British and most European postal code formats.
  • Change the case of fields on output, UPPER CASE, lower case, Proper Case (correctly handling capitalisation exceptions like McDonald or WordPerfect).
  • Generate Soundex codes.
  • Eliminate duplicate records (and non-identical records containing effectively duplicate data) based on individual field analysis.
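
The date and time conversions listed above can be sketched in a few lines of Python. This is only an illustration, not ParseRat's method: the fixed pivot of 50 for two-digit years is an assumed stand-in for the product's "intelligent century choice", and the function names are invented for the example. Dates are treated as Gregorian and times as UTC.

```python
from datetime import date, datetime, timezone

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

# date.toordinal() counts days from 0001-01-01 (= 1); adding this offset
# gives the Julian Day Number of the same Gregorian calendar date.
JDN_OFFSET = 1721425


def expand_year(yy: int, pivot: int = 50) -> int:
    """Turn a two-digit year into four digits (assumed fixed window)."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 1900 + yy if yy >= pivot else 2000 + yy


def julian_day_number(d: date) -> int:
    """Whole Julian Day Number for a calendar date."""
    return d.toordinal() + JDN_OFFSET


def julian_date(dt: datetime) -> float:
    """Julian date with fractional day; Julian days start at noon."""
    fraction = (dt.hour - 12) / 24 + dt.minute / 1440 + dt.second / 86400
    return julian_day_number(dt.date()) + fraction


def from_unix_seconds(seconds: int) -> datetime:
    """Translate a 'seconds since 1970' value to a UTC date/time."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def day_of_week(d: date) -> str:
    """Generate the day-of-week name for a date."""
    return DAY_NAMES[d.weekday()]


stamp = from_unix_seconds(946684800)          # 2000-01-01 00:00:00 UTC
parts = (stamp.year, stamp.month, stamp.day,
         stamp.hour, stamp.minute, stamp.second)
```

For the sample value, julian_day_number(stamp.date()) gives 2451545, julian_date(stamp) gives 2451544.5 (midnight is half a day before the noon boundary), and day_of_week(stamp.date()) gives "Saturday".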

Write out to new format and structure:

  • Delimited lines.
  • Fixed format lines.
  • dBase table.
  • WordPerfect merge file.
  • With or without field name headers (e.g. for Microsoft Word mail merge).
  • Command Line Interface permits calls from shortcuts, batch files or other programs needing sophisticated input parsing.
  • Interleaved Output (added with v1.0m). You may wish to use your output file as input to a mailmerge program which prints several mailing pieces on a single sheet of paper. Interleaving records permits you to maintain the same sequence as your input records after cutting up the output stack.
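
One way to picture interleaved output is the following Python sketch. It assumes each sheet holds a fixed number of pieces, that the printed stack is cut into one pile per position on the sheet, and that the piles are then stacked in order. The function name and the use of a filler for empty positions are assumptions for the example, not a description of ParseRat's actual option.

```python
import math


def interleave(records, per_sheet, filler=None):
    """Reorder records so that cutting and stacking restores input order.

    Position 'slot' on sheet number 'sheet' receives record
    slot * sheets + sheet, so each cut pile holds a consecutive run.
    """
    sheets = math.ceil(len(records) / per_sheet)
    out = []
    for sheet in range(sheets):
        for slot in range(per_sheet):
            idx = slot * sheets + sheet
            out.append(records[idx] if idx < len(records) else filler)
    return out


records = list("ABCDEFG")
printed = interleave(records, per_sheet=3)
# Simulate cutting: pile k is every third item starting at position k.
piles = [printed[slot::3] for slot in range(3)]
restacked = [r for pile in piles for r in pile if r is not None]
```

With seven records and three pieces per sheet, the sheets print as A D G, B E (blank), C F (blank); after cutting and stacking the piles, the records come out as A through G again.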

Guy Software, 1752 Duchess Avenue, West Vancouver, BC V7V 1P9. Tel: 604-926-1370; Fax: 604-926-1346.