This paper presents GEOLITH (GEOGRAPHICAL-LITHOLOGICAL), a simple and powerful FORTRAN software tool for the geological community. It was created for the sole purpose of retrieving geographical, lithological, political, and bibliographical information from the international IGBADAT (IGNEOUS DATA BASE) database and similarly structured databases. Users can apply the extracted data to their own research.
Published information on plutonic and volcanic rocks from many regions of the world abounds in the literature. Most of it appears in well-known geological journals and other earth science publications. Other information resides in less well-known proprietary sources, such as natural resource agencies, mining companies, and industrial surveys. Graduate student theses also contribute such information. Some of these sources are difficult to access and may incur costs for their release into the public domain. Petrological databases like IGBADAT permit access to such compiled information and its exploitation. For example, the KAYDER database is modeled on the IGBA (IGNEOUS BASE) structure and contains data on igneous rocks from Turkey. PETROS is another database on igneous rocks, housed by the National Oceanic and Atmospheric Administration (NOAA). These databases host a large volume of information on igneous rocks in easily accessible formats.
Since the mid-1980s, the author has pursued a long-term campaign to develop the IGBA algorithms necessary to tap the vast wealth of information housed in the IGBADAT database. The FORTRAN programs decode stratigraphic age data (STASSAGE), chronological age data (PHASS99), and geographical-lithological-political-bibliographical data (GEOLITH, this study). In addition, a maintenance program, NOGAP, was developed by  . Other developers have contributed further decoding programs. These programs enhance the visibility and popularity of the IGBADAT database and the IGBA system, and they allow data mining and exploitation of the database.
2. The IGBA System
A brief description of the IGBA system is provided here so that readers become familiar with the nature of the IGBADAT database and the goal of the GEOLITH software program described in this contribution. The IGBA-related literature gives a detailed explanation of the IGBA system.
The IGBA system of igneous rocks information is an international repository of published information on igneous rocks. It has four components: the IGBADAT database, a companion bibliography file, a collection of FORTRAN retrieval and decoding programs, and the DECIGB library. Several maintenance programs also form part of the IGBA system. Details of the various components are described fully in the IGBA literature. The IGBA system possesses a unique and proprietary structure, syntax, grammar, and vocabulary. It uses mnemonic representations for many of the data types, and alphanumeric text-string manipulation for the rest of the data.
Both the IGBADAT database and the companion bibliography files are ASCII text files. All information strings in the IGBADAT file, except those in the “C” block (see the section below), are structured according to the IGBA rules. The length of the records is fixed (80 characters). Some of the strings in the “C” block have variable lengths.
Readers are referred to the IGBA literature for information on the proprietary syntax, vocabulary, and grammar. In the 1990s, the IGBA system migrated to the INTERNET platform.
The IGBA petrological database system was originally developed in the Carnegie Institution of Washington, D.C.  , and formally published in  . Its history is described in    , which describe the design and generation of the database.
The IGBADAT database is supervised and maintained by the International Union of Geological Sciences, Commission on the Systematics in Petrology (Sub-commission on Databases in Petrology). The entire IGBA system resides at the National Geophysical Data Center (World Data Center A) in Boulder, Colorado, USA.
The current version of the IGBADAT database contains more than 25,000 descriptions of plutonic and volcanic rocks and more than 3 million records.
2.1. The IGBADAT Database and the Bibliography Data File
2.1.1. The IGBADAT Database
The IGBADAT database is the component around which all the activities of the IGBA system revolve. The database is a sample-based information bank containing geographical, lithological, political, petrological, mineralogical, geochemical, geochronological, and stratigraphical information. The database also contains much ancillary information.
The IGBADAT database and all other IGBA files are ASCII text files that can be opened with any text editor, such as NOTEPAD. Details of the IGBADAT database are described in the IGBA literature. Only geographical, lithological, political, and bibliographical data are targeted for identification and processing by the GEOLITH software tool.
Each rock specimen description in the database is composed of three records (A, B, and C). Records A and B are one line each. Record C is a block of up to 24 lines (C to Z). All record lines are 80 alphanumeric characters long.
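The fixed record length lends itself to a simple sanity check before decoding. The following Python sketch is illustrative only and is not part of the GEOLITH FORTRAN source; the function name `find_bad_records` is hypothetical, and the only fact taken from the text is the 80-character record length.

```python
RECORD_LENGTH = 80  # fixed record length stated in the text


def find_bad_records(lines):
    """Return (line_number, length) pairs for lines that are not 80 characters long."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Drop only the line terminator so that trailing blanks still count.
        text = line.rstrip("\r\n")
        if len(text) != RECORD_LENGTH:
            bad.append((number, len(text)))
    return bad
```

A file that passes this check is not necessarily valid IGBA input; the check only catches truncated or over-long lines early.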
Records A and B contain pieces of information in the form of fixed-length strings, like the geographical locations. Record C, on the other hand, is an information block containing, among other information, variable-length strings. Additional information tags are stored in the C block. The GEOLITH program uses the additional information tags (XL) and (XP) to extract from the database the geographical locations and the political information, respectively.
2.1.2. The Bibliography File
Each IGBA database file (e.g. IGBADAT) must have a companion bibliography data file. The absence of such a file is detrimental to the operation of the decoding programs. The bibliography file contains the literature source references for all the published information contained in the database file. Each reference occupies one or more record lines. Every line of a reference contains, in its first five positions, a unique identifier number ranging from 1 to 99999, and all lines of the same reference must carry the same identifier. The identifier numbers of different references need not be sequential.
The bibliography file is created according to the proprietary rules of the IGBA system. During the creation of a bibliography file, a two-slash string (//) is automatically inserted between the three conventional elements of a reference: author, title, and source. The string also appears at the end of a reference to signal its completion. The GEOLITH program uses these “//” strings to parse the reference string and to generate the author, title, and source strings for output.
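The two rules described above (a five-position identifier at the start of every line, and “//” separators between author, title, and source) can be sketched in Python as follows. This is a hedged illustration, not the GEOLITH implementation: the function names are hypothetical, and joining continuation lines with a single blank is an assumption that the text does not confirm.

```python
def group_references(lines):
    """Join the lines of each reference, keyed by the identifier in positions 1-5."""
    grouped = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        ref_id = int(line[:5])  # identifier occupies the first five positions
        grouped.setdefault(ref_id, []).append(line[5:].strip())
    # Assumption: continuation lines are joined with one blank.
    return {ref_id: " ".join(parts) for ref_id, parts in grouped.items()}


def split_reference(reference):
    """Split 'author // title // source //' into its three elements."""
    parts = [part.strip() for part in reference.split("//")]
    if parts and parts[-1] == "":
        parts.pop()  # the closing '//' leaves an empty trailing piece
    if len(parts) != 3:
        raise ValueError(f"expected author, title and source, got {len(parts)} parts")
    author, title, source = parts
    return {"author": author, "title": title, "source": source}
```

For a made-up two-line reference with identifier 12, `group_references` returns one joined string under key 12, and `split_reference` then yields the author, title, and source strings.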
2.2. Retrieval and Decoding Programs
Each type of information in the IGBADAT database has its own extraction and decoding program. Each of these programs interfaces with the DECIGB (DECODE IGBA) library, described below, to access the library sections needed to interrogate the database and to select only the requisite information. These retrieval programs operate independently of one another.
2.3. The DECIGB Library
The DECIGB library is an assortment of FORTRAN subroutines that are called by the retrieval programs described in the previous section. The library can be used in two ways. It can be inserted into each of the retrieval programs to generate bundled and robust programs, or it can be compiled separately to produce an external module that links to each of the retrieval programs individually. The GEOLITH program borrows a few of the DECIGB library subroutines that are pertinent to its operation. The geographical location subroutines LCNM and LCNF are examples of such routines. These subroutines are included in the GEOLITH source and compiled together with it to create a stand-alone GEOLITH software application.
2.4. Maintenance Programs
The IGBA system maintenance programs are stand-alone software applications, each of which performs a single task. For example, one program creates IGBA data files (e.g. IGBADAT) according to the IGBA structure and rules. Another program creates the bibliography files. Yet another program checks data files for accuracy and verifies their adherence to the IGBA proprietary formats. NOGAP is a program that removes redundant space between additional information tags in the C block of specimen descriptions. Other programs perform miscellaneous IGBA maintenance operations.
3. The GEOLITH Software Program
3.1. General Description
The GEOLITH software tool is a program written in the FORTRAN language. It is simple, user friendly, and does not require computer programming knowledge by users. It is a medium-sized program. Below is a description of the various types of data retrieved from the IGBADAT database by the GEOLITH software program. Other types of information in the database are decoded by other programs. The GEOLITH source code is in Supplementary Material Q. The executable version of GEOLITH is in Supplementary Material P.
The present version of the GEOLITH software tool was designed, written, compiled, and optimized under the LAHEY-FUJITSU FORTRAN 95 Compiler environment (SHASTA) Version 7.60.02. It underwent repeated and extensive testing to improve it and to verify its robustness and accuracy during information decoding. It works on the WINDOWS platform.
A description of the types of information extracted and decoded by the GEOLITH software application follows. When processed by GEOLITH, the IGBADAT database (or a similarly structured file) yields geographical, lithological, political, and bibliographical information, and other types of ancillary information.
3.1.1. Geographical Information
The geographical information extracted by the GEOLITH program consists of locational coordinates expressed as north-south and east-west positions. Each sample in the database has a unique location placed in the first line of its sample description. An example is 21367N 39373E, which decodes to 21.367N and 39.373E.
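The decoding of a coordinate token such as 21367N can be sketched in Python as follows. This is an illustration under stated assumptions, not the GEOLITH routine: the function names are hypothetical, and the rule that the last three digits are thousandths of a degree is inferred from the single example in the text.

```python
import re

# Assumption: digits followed by one hemisphere letter, with the last three
# digits holding the decimal fraction (21367N -> 21.367 N).
_COORDINATE = re.compile(r"^(\d{4,6})([NSEW])$")


def decode_coordinate(token):
    """Decode a token such as '21367N' into (21.367, 'N')."""
    match = _COORDINATE.match(token.strip().upper())
    if match is None:
        raise ValueError(f"not a coordinate token: {token!r}")
    digits, hemisphere = match.groups()
    return int(digits) / 1000, hemisphere


def to_signed_degrees(token):
    """Return decimal degrees, negative for the southern and western hemispheres."""
    value, hemisphere = decode_coordinate(token)
    return -value if hemisphere in "SW" else value
```

Signed decimal degrees are convenient when the decoded positions are later plotted or loaded into a spreadsheet.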
In addition to the geographic coordinates above, an optional cultural location is produced. A cultural location, if detected in a specimen description, is included in the output. It appears as an additional information tag “XL” in the “C” block of the sample description. A null output is returned if this tag is lacking or empty in the sample description. An example is “50 meters from the intersection of Highway 301 and Glendale Avenue”.
3.1.2. Lithological Information
A variety of lithological information types appear as text strings in the IGBADAT database. Information is organized into rock groups and rock samples within groups. Each rock group contains one or more samples. Input files may contain more than one rock group.
Rock Group Name
A rock group name is identified by a 3-letter name located in the first three positions of the first line of each rock group. An example is “CJL”.
System Specimen Name
The system specimen name of a rock specimen contains the 3-letter name of the rock group to which it belongs, and an appended letter in position 5 of the first line of the sample description. All samples belonging to a rock group carry the same rock group name designation. An example is “CJL A”. Records within each sample appear sequentially (A, B, C) in succeeding sample lines.
Field Rock Name
Each sample has a nominal field rock name and a proper rock name. An example of this dual naming is “siliceous” and “granite”, or “mafic lava” and “basalt”. The nominal field rock name is located immediately after the location coordinates in the first line of each sample. The proper rock name, on the other hand, appears at the end of the second line of each sample (i.e. the major-chemistry line). It is indicated by a 4-digit character string.
Rock Group Title
Each rock group has its title. It is a character string occupying positions 7 to 80 of the first line of each rock group identifier record. An example of a rock group title is “Southern Sinai Granitoids”.
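Because these fields occupy fixed positions, they can be recovered by slicing the record. The Python sketch below is illustrative only: the function name is hypothetical, and pairing the name “CJL” with the title “Southern Sinai Granitoids” in one record is an assumption made for the example, since the text gives the two examples separately.

```python
def parse_group_header(line):
    """Return (group_name, group_title) from a rock group identifier record."""
    # Pad to the fixed record length so that short lines slice safely.
    record = line.rstrip("\r\n").ljust(80)
    group_name = record[0:3]            # positions 1 to 3
    group_title = record[6:80].strip()  # positions 7 to 80
    return group_name, group_title
```

Note that one-based positions 7 to 80 in the text correspond to the zero-based slice `[6:80]` in Python.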
Geologic Unit
A geologic unit is a collection of related rock specimens forming a subgroup within a rock group. A rock group in the IGBADAT database can potentially include more than one geologic unit or rock formation. Each geologic unit or rock formation contains one or more samples of the same rock group. The geologic unit or rock formation is the last string in the first line of a sample declaration. A geologic unit in the rock group above may be “Wadi Kid Granite-Sinai”.
3.1.3. Political Information
The political information retrieved for each sample by the GEOLITH program is a country name and the name of a province in that country. The pointer for this information is the additional information tag “XP” in the “C” block of the specimen description. This information is optional. The tag, if recognized in the sample description, decodes to a proper country name and a province in that country, according to the FIPS (Federal Information Processing Standard Publication) code. Examples of this dual political naming are “EG” for Egypt and “US56” for the state of Wyoming in the USA. The mnemonic symbols for political entities are strings of two or four characters, like “SA” or “VM37”.
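The decoding of a political mnemonic can be sketched in Python as a two-step lookup. This is a hedged illustration, not the GEOLITH routine: the function name and the tiny lookup tables are hypothetical and contain only the two examples given in the text, whereas a real decoder would load the complete FIPS tables.

```python
# Illustrative tables holding only the examples from the text.
COUNTRIES = {"EG": "Egypt", "US": "USA"}
PROVINCES = {("US", "56"): "Wyoming"}


def decode_political(code):
    """Decode 'EG' or 'US56' into (country, province); unknown parts become None."""
    code = code.strip().upper()
    if len(code) not in (2, 4):
        raise ValueError(f"expected a 2- or 4-character code, got {code!r}")
    country_code, province_code = code[:2], code[2:]
    country = COUNTRIES.get(country_code)
    province = PROVINCES.get((country_code, province_code)) if province_code else None
    return country, province
```

Returning `None` for unknown or absent parts mirrors the optional nature of the “XP” tag; how GEOLITH itself reports an unrecognized code is not stated in the text.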
3.1.4. Bibliographical Information
Each sample in the IGBADAT database has associated with it a literature source reference identifier. The identifier is a 2-digit pointer to the position of the source reference in the reference vector, which is in the second line of each rock group. The GEOLITH program uses this mnemonic number to search the bibliography file for the correct source of information for a rock sample. Once selected, the reference is parsed into three strings: author, title, and source of publication.
Information in the IGBADAT database is extracted from sources in the literature by many contributors. The contributors enter the data manually into IGBA data forms. After verifying the data in the forms, contributors transfer them into digital input text files. The name of a contributor appears in the database in the second line of each rock group, immediately after the nominal regional location coordinates.
4. Spreadsheet Capability
An exceptionally beneficial feature of the GEOLITH software is its ability to generate space-delimited output files, which are ready to be exported to spreadsheets. This ability permits users to take advantage of the powerful capabilities of spreadsheets, which include numerical, statistical, and graphical aids. The creation of publication-ready plots and figures is paramount among these facilities.
5. Flow Chart of the GEOLITH Software
Figure 1 is a flow chart describing the sequential steps followed by the GEOLITH program. It details the cascading processes followed during the simultaneous processing of the IGBADAT database file (or similarly structured files) and the companion bibliography file. The sections “READ ME” and “INSTRUCTION MANUAL” (Appendix A and Appendix B, respectively) should be consulted when analyzing the flow chart; they make the GEOLITH software application easier to understand.
6. Examples of GEOLITH Files
Several example files constitute the input file ensemble for testing the GEOLITH software application. Three test input data files from this ensemble were chosen for this paper, along with a bibliography file that applies to all three. The input data files are LITHTEST.015, PERFECT.010, and FATAL.050, and the associated bibliography file is A16.T. These files appear in Supplementary Materials A, B, C, and O, respectively. To illustrate the diversity of data handled by the GEOLITH program, some artificial information was intentionally inserted into the input data files. The error files display such permutations of information. The only input required from users is the name of the input data file and the name of an associated bibliography file.
File LITHTEST.015 (Supplementary Material A) is an input file with missing data in some of the positions reserved for the extracted information. Some information was intentionally deleted from the file. Users can thus see how effectively the GEOLITH software tool detects and identifies missing data.
Figure 1. A flow chart for the program GEOLITH. It shows the sequential process of information extraction and decoding by the software tool.
File PERFECT.010 (Supplementary Material B) is an input file without missing data. This file is complete and robust, containing values for all the information targeted during interrogation of input data files by the GEOLITH program.
File FATAL.050 (Supplementary Material C) is an input file having fatal errors that abort the execution of the GEOLITH program. Errors show in the error file as messages pointing to the line in the data file that caused the termination of execution. The error file also explains the nature of the discrepancy. Users can correct the faults and re-execute the GEOLITH program. The FATAL.LTH output file for such an input file will show only the banner of the GEOLITH program. The spreadsheet-compatible file will be empty. A spreadsheet will not be created for such input files.
The corresponding output files for each of the input files should be compared to view the different capabilities of the GEOLITH software tool.
7. GEOLITH Results
This section describes the output files generated from the application of the GEOLITH software program to the input files in the previous section. These output files are internally generated by the GEOLITH tool and transparently assigned and named by the software. The output file names are the same as the root name of the input files. Extensions “LTH”, “SPH”, and “ERR” are automatically attached to the output file names. The name for the spreadsheet file and extension are assigned by the spreadsheet application (e.g. EXCEL).
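The naming rule above (the root name of the input file plus the extensions LTH, SPH, and ERR) can be sketched in Python as follows. The function name is hypothetical and the sketch only illustrates the rule; it is not taken from the GEOLITH source.

```python
from pathlib import PurePath

OUTPUT_EXTENSIONS = ("LTH", "SPH", "ERR")


def output_names(input_name):
    """Map each output extension to the file name derived from the input root name."""
    root = PurePath(input_name).stem  # 'LITHTEST.015' -> 'LITHTEST'
    return {extension: f"{root}.{extension}" for extension in OUTPUT_EXTENSIONS}
```

For the input file LITHTEST.015, this yields LITHTEST.LTH, LITHTEST.SPH, and LITHTEST.ERR, matching the output files listed in the supplementary materials.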
7.1. File with Missing Data
Users of the GEOLITH software tool frequently encounter input files in which some information is lacking. In these cases, execution of the GEOLITH program continues to completion. Error files explain the nature of the missing data and indicate their location in the database. The input file LITHTEST.015 is an example of such input files. GEOLITH produces three output files from it. The first output file, LITHTEST.LTH (Supplementary Material D), contains the retrieved and decoded information in ASCII text format. The second output file, LITHTEST.SPH (Supplementary Material E), displays the same results as LITHTEST.LTH, but cast in the structure of generic, space-delimited spreadsheets. It is ready to be copied and pasted into common spreadsheets like EXCEL. The third output file is the error file LITHTEST.ERR (Supplementary Material F), which shows the diagnostic and error messages generated by the GEOLITH program. A fourth file, the EXCEL spreadsheet LITHTEST.XLSX (Supplementary Material G), is created in the spreadsheet application and shows all the data decoded by the GEOLITH program.
7.2. File with Complete Data
Input files that lack no information in any of the searched locations produce the most complete and comprehensive results. The error files output for these input files are empty. A good example of such input files is PERFECT.010. GEOLITH produces three output files from it. The first output file, PERFECT.LTH (Supplementary Material H), contains the retrieved and decoded information in ASCII text format. The second output file, PERFECT.SPH (Supplementary Material I), includes the same results as PERFECT.LTH, but cast in the structure of generic spreadsheets. Information in this second file is ready to be selected, copied, and pasted into common spreadsheets like EXCEL. The third output file is the error file PERFECT.ERR (Supplementary Material J). No error messages appear in this error file. A fourth file, the EXCEL spreadsheet PERFECT.XLSX (Supplementary Material K), is created in the spreadsheet application and shows all the data extracted by the GEOLITH program in spreadsheet format.
7.3. Files with Fatal Errors
Input files sometimes contain faulty information, misallocated data, or other sources of trouble that lead to the fatal termination of the GEOLITH software. Such severe circumstances result in aborted jobs. All data decoded before a job is aborted are written to the output files. The input file FATAL.050 is an example of a faulty file. GEOLITH produces three output files from it. The first output file, FATAL.LTH (Supplementary Material L), contains the retrieved and decoded information in ASCII text format. The second output file, FATAL.SPH (Supplementary Material M), shows the same output data as FATAL.LTH, but organized according to the structure of common generic spreadsheets. The third output file is the error file FATAL.ERR (Supplementary Material N), which displays the diagnostic and error messages generated by the GEOLITH program.
The GEOLITH software tool presented in this contribution is a decoding program designed specifically to produce geographical, lithological, political, and bibliographical information from the international IGBADAT database. The database is a global data bank housing a multitude of information on plutonic and volcanic rocks from many countries. The geographical information is represented by the positional coordinates of each sample in the IGBADAT database. The lithological information is described by several attributes. The sample field name and the proper rock name form the most important identifiers of samples. In addition, samples are tagged by the names of the rock groups and geologic units to which they belong. Political information of samples decodes to names of countries and provinces within countries. The bibliographical information returned by the GEOLITH program is the name of the author(s), the title of the reference, and the journal specifications. The GEOLITH tool produces three output files. One of the files is spreadsheet-compatible and is ready to export to spreadsheets like EXCEL.
Regrettably, the IGBADAT database is not known to many researchers. A major reason for its obscurity is the lack of computer programs capable of extracting and decoding the information stored in it. The potential of the IGBADAT database and its neglect by the petrological community have been discussed in the literature. Before GEOLITH appeared, the geographical, lithological, political, and bibliographical information was mostly inaccessible and hardly amenable to decoding in a simple, flexible, and fast manner. The decoding programs developed in earlier work, in this study, and by others retrieve and decode this information. Such decoding programs provide users with the tools required to interface with the IGBADAT database in a simple and easy fashion.
The design and development of programs to obtain information from the IGBADAT database help popularize the database among the petrological and database communities by simplifying the process of data acquisition. Besides serving its primary purpose of producing the voluminous information needed by researchers, the GEOLITH software is fast and simple to execute. This simplicity adds to the visibility and popularity of the IGBADAT database. Every effort has been made to provide the IGBADAT and earth science community with an efficient and error-free software tool.
The GEOLITH program, along with the STASSAGE and PHASS99 programs, extracts and decodes much of the information in the IGBADAT database. Other types of information will be targeted by this author in future decoding programs. For example, a program is planned to retrieve geochemical information from the IGBADAT database; the major element oxide and minor element contents of igneous rocks will be the subject of a future decoding application.
I present in this paper the GEOLITH software application for retrieval of geographical, lithological, political, and bibliographical information from the IGBADAT international igneous rocks database. The program searches for this information in the database, extracts and decodes it, and translates it into English. Sought-after information is stored in the database as alphanumeric character text strings. Mnemonic strings are used heavily to represent much of the data in the IGBADAT database. Input for the GEOLITH program is an IGBADAT-style data file and a related bibliography file. Output is three files, one of which is exportable to a spreadsheet like EXCEL. Users of the GEOLITH software tool obtain much information that is germane to their research. Interaction with the program is simple, straightforward, and transparent. Users do not need programming knowledge to use the GEOLITH software tool.
This research was self-supported. The manuscript was edited by Professor Michael Duane and Dr. Linda Reinink-Smith.
Supplementary Materials related to this paper are composed of several files. Below is a list of the supplementary materials:
Supplementary Material A containing an example input data file (LITHTEST.015) with missing data.
Supplementary Material B containing the input data file (PERFECT.010) without missing data.
Supplementary Material C containing the input data file (FATAL.050) with fatal errors.
Supplementary Material D containing the output file LITHTEST.LTH.
Supplementary Material E containing the spreadsheet-compatible file LITHTEST.SPH. https://www.dropbox.com/s/jelneooh4qd89m3/LITHTEST.SPH?dl=0
Supplementary Material F containing the error file LITHTEST.ERR.
Supplementary Material G containing the spreadsheet file LITHTEST.XLSX.
Supplementary Material H containing the output file PERFECT.LTH.
Supplementary Material I containing the spreadsheet-compatible file PERFECT.SPH.
Supplementary Material J containing the error file PERFECT.ERR.
Supplementary Material K containing the spreadsheet file (PERFECT.XLSX).
Supplementary Material L containing the output file FATAL.LTH. https://www.dropbox.com/s/b3de6se1ovnwg99/FATAL.LTH?dl=0
Supplementary Material M containing spreadsheet-compatible file FATAL.SPH.
Supplementary Material N containing the error file FATAL.ERR. https://www.dropbox.com/s/wocd50i61glm3we/FATAL.ERR?dl=0
Supplementary Material O showing the GEOLITH bibliography file (A16.T) accompanying the input data files LITHTEST.015, PERFECT.010, and FATAL.050. https://www.dropbox.com/s/wnpw8ta38kgpe3m/A16.T?dl=0
Supplementary Material Q showing the source code for the GEOLITH program. https://www.dropbox.com/s/jogs6lp97vdsmtq/GEOLITH?dl=0
Supplementary Material P containing the executable file of the program GEOLITH.exe. This is a digital file. https://www.dropbox.com/s/mk2s430367cf3ib/GEOLITH.exe?dl=0
Appendix A. GEOLITH “READ ME”
This file describes briefly the GEOLITH software program, and its rules and limitations.
Users must consult it prior to the preparation of their input files.
The GEOLITH software tool presented in this contribution provides the geological database community with a simple, powerful, and much-needed FORTRAN application. It retrieves geographical, lithological, political, and bibliographical data from the IGBADAT database. Users do not need to know programming to use the GEOLITH tool. Supplementary Material Q contains the source code of the GEOLITH program. The source code includes many comments, explanatory remarks, and tracking statements. Users can rely on these statements if they wish to modify the program to suit their purposes. The flow chart for the GEOLITH software (Figure 1) and the example input and output files in the supplementary materials should be consulted before assembling the files and placing information in the proper positions in the files.
Limits of the GEOLITH program are listed below:
1) The length of a string declaring the job name is ≤ 100 characters.
2) The length of a string declaring file names (including extensions) is ≤ 30 characters.
3) The maximum number of samples in an input file is 100,000 samples.
4) Before GEOLITH is executed, output files with the extensions LTH, SPH, and ERR and the same root name as the input file must not exist in the folder targeted for the output files.
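The four limits listed above can be checked before a run with a short script. The Python sketch below is an illustration only: the function name, its parameters, and the message wording are hypothetical, and GEOLITH itself may enforce or report these limits differently.

```python
import os

MAX_JOB_NAME = 100      # limit 1: job name length
MAX_FILE_NAME = 30      # limit 2: file name length, including extension
MAX_SAMPLES = 100_000   # limit 3: samples per input file
OUTPUT_EXTENSIONS = ("LTH", "SPH", "ERR")


def preflight(job_name, data_file, bibliography_file, sample_count, output_dir="."):
    """Return a list of problems; an empty list means all four limits are respected."""
    problems = []
    if len(job_name) > MAX_JOB_NAME:
        problems.append("job name exceeds 100 characters")
    for name in (data_file, bibliography_file):
        if len(os.path.basename(name)) > MAX_FILE_NAME:
            problems.append(f"file name too long: {name}")
    if sample_count > MAX_SAMPLES:
        problems.append("more than 100,000 samples")
    # Limit 4: no earlier output files with the same root name may exist.
    root = os.path.splitext(os.path.basename(data_file))[0]
    for extension in OUTPUT_EXTENSIONS:
        candidate = os.path.join(output_dir, f"{root}.{extension}")
        if os.path.exists(candidate):
            problems.append(f"output file already exists: {candidate}")
    return problems
```

Collecting all problems in one list, rather than stopping at the first, lets a user fix every issue before launching the program.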
Structure of input files:
The structure of the input files is dictated by the proprietary structure rules of the IGBA system. The IGBADAT file, and files of a similar structure, will produce complete results if these rules are followed faithfully. Files with missing data will still execute successfully if the resulting errors are only diagnostic. Files with fatal errors will terminate the execution of the GEOLITH software tool.
Appendix B. GEOLITH “INSTRUCTION MANUAL”
This file explains the structure of input for the GEOLITH software program. Users must consult it prior to the preparation of their input files.
What follows are instructions for execution of the GEOLITH software application. The steps in the instructions are followed sequentially. This instruction manual ought to be used in conjunction with the flow chart in Figure 1.
1) Users must ensure that the executable version of the GEOLITH program (GEOLITH.exe), the IGBADAT input files, and the bibliography file reside in the same folder or subfolder in the computer, preferably on the desktop. This keeps operation simple. Proper paths must be set should the files reside in different locations (i.e. folders or subfolders).
2) Users must ascertain that files with the extensions LTH, SPH, and ERR attached to the root name of the input data file do not exist in the folders targeted for output.
3) Launch the GEOLITH software tool by clicking its icon on the desktop screen or typing its name on the system command prompt line, and pressing the RETURN key.
4) Type the name of an IGBA input data file and the name of a companion bibliography data file. Press the RETURN key.
5) During execution, read and follow screen instructions and displays for information regarding output files.
6) Wait for the termination of the GEOLITH program. The remaining part of the instruction manual is related to the preparation of the spreadsheet file.
7) The content of the spreadsheet-compatible SPH file should be selected and copied.
8) Launch the EXCEL spreadsheet program (or similar applications) and paste the copied data.
9) Assign a name for the spreadsheet file and save it.