- 1 Keyed File Processing
- 2 16.1 Why learn about the KEYED access method?
- 3 Btree and Btree2
- 4 16.2 How does Business Rules! do it?
- 5 16.3 Using the INDEX command
- 6 16.4 Opening an existing file for keyed access
- 7 16.5 Special Functions, Just for Keyed Processing
- 8 Opening a new file for keyed access
- 9 16.5b The easy way to do all this
- 10 16.6 KEY and SEARCH options
- 11 16.7 Sample program with keyed access for updating
- 12 16.8 Sample report program to read sequentially by key
Keyed File Processing
In previous chapters, you learned about the organization of internal files and how to process them with the sequential and relative methods of access. Most business programmers would say that the best is yet to come. Using internal files with the keyed method of access is an enormously popular and practical technique for business programming. When you finish this chapter you will be able to:
- Define keyed file processing, key field and key file.
- Use the INDEX command to create or rebuild a key file.
- OPEN a new or existing internal file for keyed processing.
- Use the READ, REWRITE, RESTORE and DELETE statements with a KEY=n$ clause.
- Use keyed processing to read a file as if sorted by the key field.
16.1 Why learn about the KEYED access method?
Sometimes called index file processing or ISAM (Indexed Sequential Access Method), and sometimes called the key-indexed access method, Business Rules! keyed access method allows you to access records directly-even when you don’t know the location (or the relative record number) of the record within the file. The Business Rules! system allows you to do this when you supply the information in the record’s key field, or the field that identifies the record.
In Business Rules! we also have Btree indexing. It is alot faster and self maintains, while ISAM has to be maintained once in awhile. Btree is the default, for Business Rules! however by specifying OPTION 5 in the wbconfig.sys you can turn it off.
In the last two chapters, you learned that the sequential method of file access is used to process records in the order they occur in the file and that the relative method of access can be used to process records in any order you want. You even saw one program, called LISTLAST, which used a mixture of the two techniques. If you already have two techniques, one for processing records in order and one for processing records out of order, you may be wondering what else you could need.
Suppose you had a file of more than 1000 automotive parts. Instead of your having to know the relative record numbers (positions within the file) of each record, you can designate certain positions within the record as the key field. In this case, the key field would be the positions in the record where part number is located. Any part number that you want can be accessed just by using the part number as the value of the key field.
Let’s look at another example. Suppose you had a file called STATES with 50 records, one for each state in the union. Perhaps the file contains sales records for each state, or perhaps population and other statistics about the state-it really doesn’t matter for this example. Now suppose that you want to look up something for New York.
How do you know which one of the 50 records is the record for New York? If the information in the field had been entered in alphabetical order using the 2-letter state abbreviations of the U.S. Post Office, then New York’s relative record number would be its position in the alphabetized list of state codes. That sounds good, but does anybody know which state is 27th on the list? Do you know the position number for New York? It’s 32!
The easier way to keep track of the record numbers is to let the Business Rules! system do it for you with the keyed access method of file processing. This way you could just type in NY for New York, and Business Rules! would find the record for you.
The keyed access method would allow you to access any record in the same file by knowing just the two-letter state code and using it as the key field. To look ahead in this chapter, a statement to read part of the record for New York could look like this:
125 READ #3,USING F3,KEY=”NY”: TOTAL, COUNT
Notice the similarity between the KEY= clause in the above statement and the REC= clause that you used in the relative processing chapter. The KEY= clause specifies the contents of the key field for the record you wish to access. We will talk more about this clause in a later lesson.
The key field (also known as the key, the record key, the index or the index field) is used for building a special internal file called the key file (also known as the index file). You will learn more about this file in a later lesson but in short, its purpose is to hold a list of information that acts much like a cross-reference from the key field to the relative record number.
Btree and Btree2
Consider the following statement:
BTREE_VERIFY (filename) [ OFF ]
This statement causes BR to separately audit the structure of the current index branch after each operation that alters an index relating to the named file. Obviously some performance penalties are in effect when processing files in diagnostic mode. However, it can pinpoint what update operations are failing when failures occur.
Also, the INDEX command has another keyword, VERIFY, that can be used in place of REPLACE or REORG. If used with a BTREE2 file, the index will be audited instead of being rebuilt. Furthermore, REORG, in relation to BTREE2 INDEX commands, means "audit and if needed rebuild."
To activate this feature, set OPTION 22 in your WBCONFIG.SYS file. The system will automatically create new indexes in BTREE2 format and will keep track of which files utilize BTREE2 indexes. Note that BR does not permit mixing Btree index types within related files (multiple opens of the same master file with different indexes).
OPTION 5: default to ISAM, Default index type = 4; 7=btree, 4=isam
OPTION 22: The system will automatically create new indexes in BTREE2 format and will keep track of which files utilize BTREE2 indexes. It will audit and if needed rebuild them.
Quick Quiz 16.1
1. All of the following are used interchangeably except:
a) Keyed method of access.
b) Index file processing.
c) ISAM (Indexed Sequential Access Method).
d) Key-indexed access method.
e) Unlocked record reading.
2. All of the following are used interchangeably except:
a) Key file.
b) Index file.
c) Address out file.
3. All of the following are used interchangeably except:
a) Key field.
b) Index field.
c) Record key.
f) Record Number.
4. Which statement is not true about the key field:
a) It must be included in the record.
b) It is used to locate records in keyed file processing.
c) It is defined as the positions in a record used to identify that record.
d) It is used for building a special internal file called the key file.
e) None of the above.
16.2 How does Business Rules! do it?
Business Rules! keyed method of file access uses two fields working together. The internal field containing the complete records is called the master file. Files such as CHECKBOOK.INT and ACCOUNTS.INT are typical of master fields. The second file is the key file or index file. This file acts like a table which helps the system to look up specified records rapidly based on the value of the key field. The index file (or key file) is created with the INDEX command, or when the master file is created.
An index file is a lot like an index found at the back of a book. In this analogy, the book is like the master file because it is complete by itself and contains all the information. The index of a book contains an alphabetical list of things you are tying to look up, and each item is followed by the page number where you can find it. Like the page numbers, relative record numbers found in the index file also tell you where to go to find what you are looking for.
Each record in the key file consists of a copy of the key field, followed by the relative record number for the corresponding record in the master file. The key field is defined by its starting position (called key starting position) and field length (called key field length) in the master file.
What are some rules for choosing a key field? The maximum length for a key field is 128 characters. Usually the key field is one of the fields from the record layout, for example, the state abbreviation. However, a key field could be two or more fields from the record layout taken together, perhaps the state abbreviation and population. Key fields can be adjacent or split. A key field could also be all of one field and part of another field. The only restrictions are 128 characters maximum.
Here is an example of how a master file might look that contains seven states where the Excelsior Widget Co., Inc. did business last year. Let’s assume that positions 1 and 2 contain the key field of state code. For this example, it does not matter what is in the remainder of the record.
|Relative Record Number||Key Field|
As you can see there is no particular order to these records within the master file. The INDEX command builds a key file and arranges the records in the key file in alphabetical order. Let’s add the records for the key file as the third column.
|Relative Record Number||Key Field||Record In key File|
|1||VA||AL / 3|
|2||WV||AR / 4|
|3||AL||KY / 6|
|4||AR||LA / 7|
|5||MO||MO / 5|
|6||KY||VA / 1|
|7||LA||WV / 2|
Each record in the key file contains the key field and a relative record number. Key file records are sorted in alphabetical order from the lowest letters to the highest (ascending order). Look at the first entry in the third column, where you will find AL/3. This means that the record for AL (Alabama) is the first key field in the alphabetized sequence; the 3 means that it is found in the third record in the master file. Similarly, the seventh entry in the third column is WV/2 which indicates that the last record in alphabetical sequence in the master file is record 2 which contains the key field WV (West Virginia).
Finding a particular record is fast because the keys are arranged in alphabetical sequence in the index file. To find the record for LA in the master file, Business Rules! searches this small efficient table. When it locates the record for LA, that same record contains the information that LA is relative record number 7 in the master file.
Just like finding the page number in the index of a book, the system can now go to that record number and find the desired information. The program, the programmer, and the operator never need to know that it is record 7; Business Rules! keeps track of all the details and finds the record that you want.
However, the efficiency of keyed access can be slowed down when more records are added without periodically rebuilding the index. For example, suppose that 10 more states are added to the master file. Later you will learn how to add records to both the master file and key file simultaneously, so that the new file could look like this.
|Relative Record Number||Key Field||Record In key File|
|1||VA||AL / 3|
|2||WV||AR / 4|
|3||AL||KY / 6|
|4||AR||LA / 7|
|5||MO||MO / 5|
|6||KY||VA / 1|
|7||LA||WV / 2|
It is not necessary that you understand all the details about what is in a key file and how the different parts are searched. The main point is that the keyed access method can be very fast when the index file is well organized. As more new records are added, the index file gets more disorganized. So, it is important to use the INDEX or REORGANIZE commands every once in a while to keep things as speedy as possible. In the next section, you will learn the “how” part of using these commands; in this section, you should now know the “why” part, the importance of periodically rebuilding index files.
What is the defining characteristic of the keyed access method? In summary, the keyed access method allows you to access a record independent of other records by specifying its contents using a programmer-designated key field.
Techniques for keyed access also have advantages over techniques for relative access when the STATES file has to add new records (e.g., for new sales in the District of Columbia or Puerto Rico) in alphabetical order. With relative processing, there is no way to insert a record for DV in between record 7 for CT and record 8 for DE; all record numbers after 7 would have to be re-assigned. No programming changes or operator retraining is necessary when DC is added to programs using keyed access.
Keyed access would also work the same way if the STATES file contained fewer than 50 states. The file could start out with only 1 state and grow as needed. There never has to be any space for unused records in the file (as is often necessary with relative access when records will be added or deleted). No keyed access programs would need to be changed. When records are deleted, no programming changes are needed. The only system overhead, which is recommended but not required, is to rebuild the index files after many new records have been added to speed up system performance.
Quick Quiz 16.2
True or False:
1. In keyed processing, the two files which must work together are the master file and the key file.
2. The key field is defined by its starting position and its length.
3. A key field can be up to 240 characters long
4. All positions in a key field must be next to each other.
16.3 Using the INDEX command
The following is a list of what you need to begin using the keyed method of access:
- An internal file.
- A record layout for the file.
- A key starting position and key field length determined form the record layout.
- A key file, usually built from the internal file using the INDEX command or created when the master file has been created.
- A new variation of the OPEN statement which specifies the keyword KEYED and also the name and location of the key file when creating the master file.
- A little knowledge about how to add the KEY=n$ clause to the READ, REWRITE, DELETE and RESTORE statements.
By now you should understand the internal file and record layout concepts. This lesson shows you how to create a key file by specifying the key starting position and the key field length.
The next two lessons deal with the use of the OPEN statement with KEYED, and the remainder of the chapter then deals with the KEY=n$ clause.
The INDEX command creates a new key file or rebuilds an existing key file. After executing the INDEX command, all entries are in the primary (organized) area, with none in the overflow (unorganized) area; this means all entries are sorted in ascending order, and the keyed access method can run at maximum efficiency.
The main thing that you need to know to use the INDEX command is what the parameters are which appear after the keyword INDEX. There is one additional concept that should be discussed first. Business Rules!’s keyed access method allows duplicate keys. A duplicate key is a key field that is identical to the key field for one or more other records in the file.
In many cases, duplicate keys are not desirable. Often, the key field is chosen to be unique for each record. In the STATES file, it would be a mess if several records had a TX or MN in the key field. In other cases, a second index file can be created using a field like customer last name or a field containing a date; in these situations, it seems likely that sooner or later the same name or date will be repeated, and duplicate keys will exist.
When duplicate keys occur, the INDEX command will always build a key file. There is an option which determines whether the operator receives a warning message (error code 7603) that duplicate keys were found. There are other options to allow listing of the duplicate key fields along with their record number. This listing may be sent to the screen, the printer, or a file. Here is an example of a file indexed by state code to show what the duplicate keys output is like. Key fields are in the left column and relative record numbers are in the right column.
An important potential problem that you must be aware of is that when duplicate keys are allowed, it is the responsibility of the programmer to make sure that updates are applied to the correct record. If Mr. Jones pays $25 on his charge account, and the key field is last name, the programmer must provide some way to get the $25 to the correct Jones. Later in this chapter, there will be an example of processing a file with duplicate keys.
Now that you understand what is meant by duplicate keys, let’s examine the INDEX command. Some of the information is required, and some is optional.
Let’s begin with the four required parameters.
1 ) Master file-ref the name and path of the master file for the index you wish to create.
2 ) Index file-ref the name and path of the key file to be built.
3 ) Key starting position the position within the records of the master file where the first character of the key field is found.
4 ) Key length the number of characters in the key field.
As an example of an INDEX command, using only these four required parameters, let’s consider creating a new key file for the internal file CUSTOMER.INT in subdirectory ARDATA on drive C:. Let’s assume that the key field is customer number which starts in position 1 and is 5 characters long. The command to create this key file would be:
INDEX C:CUSTOMER.INT/ARDATA C:CUSTOMER.KEY/ARDATA 1 5
As promised, this tutorial has an arbitrary convention of using .KEY as an extension on all key files. The above example assumes that the master file is found in a directory called ARDATA which is a subdirectory one level below the current default directory on drive C:. Many programmers try to keep the master file and the key file together in the same directory. By keeping both files in the same directory, and by using the same file name with a different extension, both files could be listed with the DIR command:
Now let’s examine the optional parameters of the INDEX command.
Work path - A path (drive and directory) where a work file can be created if needed. The indexing process requires availability of a temporary “scratch-pad” workspace. If this optional parameter is not specified, the Business Rules! default is to use the default drive and directory for this workspace.
REPLACE - If there is a file with the same name and path as specified for the key file, the keyword REPLACE indicates the new file should replace the existing file. When REPLACE is not specified, Business Rules! expects to create a new file.
DUPKEYS - If you decide to allow duplicate keys in the key file, you can specify DUPKEYS to prevent the warning message (7603) from appearing on the screen. If DUPKEYS is not specified, the default is that there will be a warning message if any duplicate keys are found. The key file will always be created, with or without this optional parameter.
LISTDUPKEYS - As the key file is being created, LISTDUPKEYS lists all duplicate keys on the screen. Both the value of the key field and the relative record number are displayed. With this information, you can then decide whether to leave the duplicates as they are or to change them.
<File-ref - This optional parameter sends LISTDUPKEYS information to a file specified by the file-ref. Programs can use input from this file to report on or delete the duplicate keys. This file is a display file; the differences between display files and internal files will be discussed in the next chapter. Specifying >PRN: would send the LISTDUPKEYS output to the printer.
Of these five optional parameters, the most frequently used is probably REPLACE.
If REPLACE is not coded, and if the key file already exists, error code 7611 will occur and the new key file will not be built. However, if REPLACE is coded, and if the key file does not exist, there is no error. Since coding REPLACE can prevent an error, and never produces an unwanted error, a practical rule is to always specify the REPLACE parameter.
It is typical to code REPLACE, a typical INDEX command would look like this:
INDEX CUSTMAST.INT CUSTMAST.KEY 1 7 REPLACE
When duplicate keys are expected, the optional DUPKEYS parameter is used to prevent a warning message to the operator.
When duplicate keys are not expected, it probably means the file has been messed up and needs to be repaired by a programmer. The following INDEX command makes sure that any unexpected duplicate keys are listed to the printer, so that they can be examined later.
INDEX ACCT.INT ACCT.KEY 10 5 REPLACE LISTDUPKEYS >PRN:
Quick Quiz 16.3
True or False:
1. A command which creates new key files or rebuilds existing key files is called the INDEX command.
2. A key field must be chosen so that it is like a unique finger print to identify each record; no two fingerprints or key fields may ever be the same.
3. One situation where a duplicate key could be created is when the exact same record is entered twice in a file.
4. Immediately after an INDEX command, the key file is arranged for maximum efficiency.
5. The four required parameters of the INDEX command are master file-ref, index file-ref, key starting position, key length and REPLACE.
6. A key field is the same as a keyword.
16.4 Opening an existing file for keyed access
As usual, the first step in reading information from a file or writing information to a file is to open the file. There are only two changes for opening a file for keyed access instead of sequential or relative access. First, the file identification string must include the names of two files. In addition to the usual NAME= clause, you must add a KFNAME= clause to identify the name and location of the key file.
The second change is that you must specify the keyword KEYED instead of the keywords SEQUENTIAL or RELATIVE for method of access. For example, the following statement would allow the program to be able to read a master file called ACCT.INT using a key file called ACCT.KEY:
500 OPEN #5:”NAME=ACCT.INT,KFNAME=ACCT.KEY”,INTERNAL,INPUT,KEYED
The parts above are: Line number, File Number, Master File Name, Key File Name, File Type, Type of Use, and Method of Access
Let’s briefly review the parts of this OPEN statement. The file number is an arbitrary number from 1 to 127 and serves as a short-cut for the file name. The file identification string includes the name of the master file in the NAME= clause and the name of the key file in the KFNAME= clause. This string is the only place that the name of the file or key file is mentioned in the program. The next method of access can only be used with internal files, the keyword INTERNAL appears next. Similarly, in order to use the keyed method of access, the keyword KEYED appears at the end of this OPEN statement. The keyword INPUT as the next to last parameter was chosen arbitrarily just to fill that position in this example.
Quick Quiz 16.4
Examine the following program. Enter it into Business Rules! as you go to help you debug it.
10 OPEN #63: “NAME=CLASS.INT”,INTERNAL,INPUT,SEQUENTIAL 20 READ #63,USING 30,KEY=”121212”: ZIPCODE$ 30 FORM POS 7,C 5 40 PRINT 40;ZIPCODE$ 50 PRINT 50;KPS(63);KLN(63)
1. When you type RUN, what error code occurs on line 20?
2. What keyword should you change in line 10 to eliminate the error in line 20?
3. What key length should you specify when creating a key file for use in this program?
4. Assuming that the key field starts in position 1, write and execute a command to build an index file.
5. Modify line 10 to access the key file that you created, then type RUN. What value is printed for ZIPCODE from line 40?
6. What two numbers are printed in line 50? Explain how they match up with the INDEX command that you typed.
16.5 Special Functions, Just for Keyed Processing
From the many built-in functions in Business Rules!, the last quiz introduced two functions that are especially helpful in programs using the keyed file access.
These functions need only one argument, a file number. When N is a file number,
- PRINT KPS(N)
Will print the key starting position for file number N.
- PRINT KLN(N)
Will print the key length for file number N.
If file number N is not open, or is not open for keyed access, these functions will return -1. Besides using them in print statements (perhaps for debugging), these functions can be used anywhere that numeric expressions can be used.
Opening a new file for keyed access
Do you remember what it means when you see the presence of the RECL=. clause in the file identification string? It means that a new file is to be created. That’s still true for the master file, but how do you create a key file? Like many other great questions in life, the answer is “it depends!”
If the master file already exists, the way to create a key file is with the INDEX command. What if you want to create a new master file and a new key file at the same time? The answer is using the two new clauses in the file identification string of the OPEN statement.
Here’s an example of an OPEN statement that creates a new master file called ACCT.INT and a new key file called ACCT.KEY.
10 OPEN #1: ”NAME=ACCT.INT,RECL=31,KFNAME=ACCT.KEY,KPS=1,KLN=4”,INTERNAL,OUTIN,KEYED
RECL=31 specifies that the new records will all be 31 bytes long.
The KFNAME= clause in line 10 specifies the name of the key file to be created. Just as the presence of RECL= clauses indicates a new master file, the presence of KPS= and KLN= clauses indicates that this file is new also. As you probably have guessed, the KPS=1 in line 10 means the key starting position is position 1 in the record in the master file. Similarly, the key length is 4. In other words, the key field is 4 bytes long and starts in position 1.
What is the effect of using an OPEN statement to create both a master file and a key file? Later in the program, when records are added to the master file, the key file will automatically be updated as well. However, all entries in the key file will be in the overflow area. You should remember to rebuild the key file at the end of your program.
Your next to last statement could be:
99000 EXECUTE “INDEX ACCT.INT ACCT.KEY 1 4 REPLACE
Let’s think about when these four clauses should be included in the OPEN statement for keyed access. Does it make sense to have a new master file and an old index file? Is it possible to have an old master file and a new index file? No and no. The only meaningful combinations are to use all four clauses (both files are new) or none of the four (both files are old).
Quick Quiz 16.5
For each OPEN statement below, indicate whether an error will occur or not.
1. 10 OPEN #1:”NAME=ACCT.INT,KFNAME=ACCT.KEY,KPS=31,KLN=4”,INTERNAL ,OUTIN,KEYED
2. 10 OPEN #1:NAME=ACCT.INT,RECL31,KFNAME=ACCT.KEY,KPS=1,KLN =4”, INTERNAL,OUTIN,KEYED
3. 10 OPEN #1:”NAME=ACCT.INT,RECL=31,KFNAME=ACCT.KEY,KPS=1, KLN=34”, INTERNAL,OUTIN,KEYED
4. 10 OPEN #1:”NAME=ACCT.INT,KFNAME=ACCT.KEY”,INTERNAL,OUTIN,KEYED
5. 10 OPEN #1:”NAME=ACCT.INT,RECL=31,KFNAME= ACCT.KEY”, INTERNAL,OUTIN,KEYED
6. 10 OPEN #1:”NAME=ACCOUNT.INT,RECL=31,KFNAME=ACCT.KEY”,KPS =1,KLN=4”, INTERNAL,OUTIN,KEYED
16.5b The easy way to do all this
About now you may understand everything, or you may be a bit confused. Keyed access makes file processing fast and more convenient. The good news is that there is an even faster and more convenient way to do it.
FileIO is a program that uses file layouts to organize and access your data files, create indexes and sort your information in a convenient full-screen format. You can also make changes to the data file and FileIO will automatically update it for you. FileIO is available online free at www.sageax.com, along with a brief tutorial on how to start with it and documentation with further details.
Download FileIO now and we’ll walk through how to set it up. It downloads in a zip file, so open it and copy and paste the FileIO folder into the location you would like to work with it from. FileIO works because it uses File Layouts. Using a text editor, create a File Layout by typing this:
chekbook.int, CH_, 0 recl=63 =============================================
This is the file layout header. The first line has three elements. The first is the file name and path: Chekbook.int and its in the current directory (copy it into the same folder as FileIO for this one). The second is a unique prefix associated with the file: CH_. FileIO uses this prefix to identify fields from the file used in your programs. The third field indicates the current version of the file; 0 in this case because it’s the first one. FileIO will update your data file when you increase the version number and it’s higher than the current version of the file.
If you don't want Fileio to update your data file, then always keep the version number 0. Otherwise, if you do want to use the automatic update feature, always change the version number as you make changes to the file. Its easy to tell what version your data file is. Simply run one of the programs that opens the Chekbook.int file, and pause the program with the file open. Then type Status Files at the prompt and it will tell you most of the information you need to make your file layout header.
The next several lines of the file layout describe the keys that the file uses, if any have been created already. Specify each key on its own line, followed by the field names that the keys are made out of. Chekbook currently has no keys, so we will use FileIO to set one up shortly.
After all your keys are specified, you need a line telling the record length of your file. This file has a record length of 63.
After that we have a row of “========,” which is ignored by FileIO. Use this row to make the file layout more readable by placing a horizontal divider between the header and the detail sections.
Now that we've gone over all the parts of the header, lets take a look at the detail section of the file layout:
chnum, Check Number, N 5 amount, Amount, N 10.5 typetran$, Type of Transaction, C 1 datew, Date Written, N 6 datec, Date Cleared, N 6 payee$, Payee, C 25 accnum, Account Number, N 8
You should recognize these elements from when you wrote the Chekbook program earlier.
Chekbook.int has 7 fields in it. Each row in the layout describes one field, using three elements each. The first column is the name of the field. This is the same name that's used to describe the keys above. This name is also used in code whenever you want to reference this field in the data file. The second element is a short description of the field. The third element is the form spec of the field, which you should remember from creating the fields earlier.
Add these fields to your layout now, or copy and paste them from this document. Double check to make sure everything is correct, especially noting the placement of commas. Save the file as chekbook with no extension in the FileLay folder, and you're ready to test it.
Open the BRNative from within your FileIO folder and you should be able to select the Chekbook file layout for viewing. FileIO will take you to a page which shows all the information from the data file.
Don’t be discouraged if you get an error message or two. Just check the layout file again and pay attention to commas, uppercase vs lowercase etc and try again.
Now that you’ve set up the data file in FileIO, you can make changes with it to access and index it. Open up the File Layout again and add a key to the header and change the 0 to a 1, since this is the next edition of your file, like this:
chekbook.int, CH_, 1 chekbook.key, chnum recl=63 =============================================
FileIO will now create the index file for you.
Now... imagine that when you entered the information into the data file, you had made a mistake, perhaps entered 10,000 instead of 1,000 for an amount of a check, or spelled a Payee’s name wrong. FileIO makes this extraordinarily easy to check and edit.
When you run FileIO, instead of selecting Chekbook and “VIEW”, press F5. This will allow you to make and save changes to the Chekbook.int data file. FileIO handles the details. Learn more about the tools available through FileIO by visiting the website again. For example, remember when we added the entries$ and entries arrays in the practice in chapter 14? Once your program is setup to work with FileIO, it can speed up the entire process and do a lot of the repetitive steps of programming for you.
16.6 KEY and SEARCH options
Next we’ll continue with KEY vocabulary, but first, a note on terminology:
In this section, KEY=n$ (with a lowercase n) will be used to indicate all four of the options on this clause. KEY=N$ (with an uppercase N) will be used to indicate this one specific option of the four.
When an internal file is opened with the KEYED parameter, several I/O statements will accept an optional clause for using the associated key file.
This optional clause is referred to as the “KEY=n$” clause. In the READ and RESTORE statements, the “KEY=n$” clause can have any of the following four forms: KEY=N$, KEY>=N$, SEARCH=N$, and SEARCH>=N$. Only KEY=N$ can be used with the DELETE and RESTORE statements. None of these options can be used with the REREAD or WRITE statements.
If duplicate keys exist, all forms of the KEY=n$ clause will return the first match, which is the record with the lowest record number (first one entered to the field) when more than one record has the matching key. If a key file is duplicated four times in the file, a READ with a KEY=n$ clause will always read the first record in the master file -- even if the READ statement is in a loop to be executed four times. More details on how to program with duplicate keys will be presented later in this chapter.
As used in this section, N$ can be any string expression. String constants, simple variables, subscripted variables, and string functions can be used. These elements may be combined with one another and used in concatenation and substring operations.
250 RESTORE #1, KEY=”121212”: 260 RESTORE #1, KEY=NAME$: 270 RESTORE #1, KEY=X$(4) 280 RESTORE #1, KEY=”12”&NAME$(3:6): 290 RESTORE #1, KEY=”12”&X&(4)(3:6): 300 RESTORE #1, KEY=LPAD$(STR$(ACCT),6): 310 RESTORE #1, KEY=CNVRT$(“PIC(ZZZZZ#)”,ACCT): 320 RESTORE #1, KEY=CNVRT$(“PD 6.3”,ACCT):
Lines 250 to 320 are examples of various types of string expressions used in the KEY=n$ clause. All these examples assume the length of the key field is 6 (that is, PRINT KLN(1) will print a 6). You should also assume the following dimensioning:
DIM NAME$*6, X$(4)*6
Lines 280 and 290 both use concatenation to join the string “12” to a substring made up of the last 4 letters (positions 3 to 6) of the variable mentioned.
Line 300 illustrates the use of string functions to take the numeric value in the variable ACCT and make it a 6-character string by padding it with blanks on the left. This is often useful because the fields must be strings; even if it was written to the file as numeric data, and even if it is being matched against a number in the current program, it must be a string when a KEY=n$ clause is being used. Business Rules! built-in STR$(X) function takes numeric value of X and makes it into a string; but the string at this point is not necessarily the same length as the key field.
Business Rules! LPAD$(A$,N) function pads the string A$ with blanks on the left until the length is N. You might think of line 300 as an equivalent to lines 301 to 303.
301 TEMP1$=STR$(ACCT) 302 TEMP2$=LPAD$(TEMP1$,6): 303 RESTORE #1, KEY=TEMP2$:
If ACCT=385, then TEMP1$=”385” and TEMP2$=”385”. Notice that LEN(TEMP1$)=3 and LEN(TEMP2$)=6.
Line 310 shows another way to make a number into a string. The difference is that there will be zeroes at the front of the string instead of blanks. Business Rules! CNVRT$(SPEC$,X) function will take any valid conversion specification used for internal files, and use it to convert X into a string. Line 320 shows that packed decimal fields can be used as key fields, however the CNVRT$ function must be used to create strings that will match the values in the file.
Now that you have seen some tricks with strings to make the KEY= clause more flexible and powerful, let’s look at the variation of this clause. The four forms of “KEY=” are constructed form the two keywords KEY and SEARCH, and two operators = and >=. These keywords and operators are explained below.
KEY indicates that the length of string n$ must be identical to the length of the key field (that is, the length you get from PRINT KLN(file-number). SEARCH indicates that the length of n$ can be less than or equal to the length of the key field; the search process looks only at enough characters to match the length of n$. = indicates n$ must be an exact match. >= indicates that the next record in the key sequence is to be used when an exact match cannot be made.
Consider the following four examples for an index file. The key field is a 6-character date field and contains the year in the first two positions:
400 RESTORE #1,KEY=”860101”: NOKEY 920 410 RESTORE #1,KEY>=”860101”: NOKEY 920 420 RESTORE #1,SEARCH=”86”: NOKEY 920 430 RESTORE #1,SEARCH>=”86”: NOKEY 920
Line 400 positions the file pointer at the first record with a key field of 860101; if no exact match exists, the NOKEY error occurs. The NOKEY error condition indicates no key found; it corresponds to error code 4272. Line 410 positions the file pointer at the first record with a key field of 860101; if no exact match exists, this statement places the pointer at the record with the next highest value in the key field, e.g., 860103.
The NOKEY error occurs in this case only when no record in the entire file has a key field with a value greater than or equal to 860101.
Line 420 moves the file pointer to the first record with 86 in the first 2 positions of the key field. If there is no match for this criterion, the NOKEY error occurs--even if records exist with 85 or 87 in these positions. Line 430 also moves the file pointer to the first record with 86 in the first two positions of the key field; if there is no match for this criterion, the statement positions the pointer at the record with the next highest value in the first two positions of the key field. The NOKEY error occurs only when no record in the entire file has a key field with a value greater than or equal to 86 in the first two positions.
Quick Quiz 16.6
True or False:
1. The REREAD statement can have a KEY=N$ clause.
2. The READ statement can have a SEARCH>=N$ clause.
3. The DELETE statement can have a SEARCH=N$ clause.
4. The RESTORE statement can have a KEY>=N$ clause.
5. The REWRITE statement can have a SEARCH=N$ clause.
6. The WRITE statement can have a KEY=N$ clause.
7. The REWRITE statement can have a KEY=N$ clause.
8. The DELETE statement can have a KEY=N$ clause
9. Any valid string expression can be substituted for N$ in the KEY=N$ clause.
10. When KEY=N$ or KEY>=N$ is specified, the length of N$ must be equal to the length of the key field in the master file; otherwise, error code 0718 occurs.
11. When SEARCH=n$ or SEARCH>=N$ is specified, the length of N$ can be equal to or less than the length of the key field in the master file.
12. >= indicates that if there is not an exact match found, the next record in key sequence will be used.
13. When KEY=”1234” is used, the key length must be less than or equal to 4; otherwise, error code 0718 occurs.
14. The NOKEY error condition means there is no match for the specified key under the matching rules specified by the choice of KEY or SEACH and the choice of = or >=.
16.7 Sample program with keyed access for updating
Let’s look at a program which uses keyed access. The UPDTCHEK program can be found in full with your supplemental programs. All except 2 sections of it are listed below. These two sections are noted in lines 700 and 8500.
After records are added to the CHEKBOOK.INT file by another program, this program allows for making changes to existing records. This process is often called updating. To aid in finding the particular record to be updated, there is an index file built using the field for check number as the key field. Although this system will not allow us to update all records in the file (for example, deposits can not be updated because they do not have a check number), it still provides a nice illustration of index file processing.
The UPDTCHEK program is typical of many keyed access programs in that it is highly interactive and uses a lot of full screen processing. The operator requests a particular record out of the file by entering some identifying code; this identifier is used as the key field when the program looks up that record. The record that is found is displayed on the screen to allow the operator to make changes.
00100 ! ****************** U P D T C H E K ********************** 00200 ! PURPOSE: Edit CHEKBOOK.INT by Check Number * 00300 ! Business Rules! 00500 ! ********************************************************* 00600 ! 00700 ! Lines 700-4900 DIM and setup arrays for full screen 04900 ! 05000 OPEN #1: “NAME=ACCOUNTS.INT”,INTERNAL,INPUT,RELATIVE 05100 OPEN#2:“NAME=CHEKBOOK.INT,KFNAME=CHECKBOOK.KEY”, INTERNAL, OUTIN,KEYED 05200 ! 05300 MAINLOOP: ! 05400 PRINT NEWPAGE ! start screen 1 05500 PRINT FIELDS PFDEFN1$: PROMPT1$ 05600 INPUT FIELDS DFDEFN1$: CHECKNUM CONV 5600 05700 IF CHECKNUM =< 0 THEN GOTO DONE 05800 READ #2,USING FORM1,KEY=LPAD$(STR$(CHECKNUM),5): AMT,DW$,CHECKDATE,CLRDATE,PAYEE$,ACCT NOKEY ERRMSG1 05900 LET CHECKDATE$ = CNVRT$(“PIC(######)”,CHECKDATE) 06000 LET CHECKDATE = VAL(CHECKDATE$(3:6)&CHECKDATE$(1:2)) 06100 LET CLRDATE$ = CNVRT$(“PIC(######)”,CLRDATE) 06200 LET CLRDATE = VAL(CLRDATE$(3:6)&CLRDATE$(1:2)) 06300 PRINT NEWPAGE ! Start screen 2 06400 PRINT FIELDS MAT PFDEFN2$: MAT PROMPT2$ 06500 PRINT FIELDS “4,20,C 13”: “Check Number:” 06600 PRINT FIELDS “4,36,N 5,u”: CHECKNUM 06700 RINPUT FIELDS MAT DFDEFN2$: CHECKDATE,CLRDATE,PAYEE$,AMT, DW$,ACCT 06800 READ #1,USING FORM2,REC=ACCT: ACCTDESC$ NOREC ERRMSG2 06900 PRINT FIELDS MAT ACCTDEFN$:”ACCOUNT”,ACCT,” = “&ACCTDESC$ 07000 IF UPRC$(DW$)<>”D” AND UPRC$(DW$)<>”W” THEN GOTO ERRMSG3 07100 PRINT FIELDS “20,20,C 33”: “Do you want to make any changes?” 07200 LET ANSWER$= “NO” 07300 RINPUT FIELDS “20,54,C 3,r”: ANSWER$ 07400 IF UPRC$(ANSWER$(1:2)) <> ‘NO’ THEN !: RINPUT FIELDS MAT DFDEFN2$: CHECKDATE,CLRDATE,PAYEE$,AMT,DW$,ACCT !: GOTO 6800 07500 LET CHECKDATE$ = CNVRT$(“PIC(######)”,CHECKDATE) 07600 LET CHECKDATE = VAL(CHECKDATE$(5:6)&CHECKDATE$(1:4)) 07700 LET CLRDATE$ = CNVRT$(“PIC(######)”,CLRDATE) 07800 LET CLRDATE = VAL(CLRDATE$(5:6)&CLRDATE$(1:4)) 07900 REWRITE #2,USING FORM1: AMT,DW$,CHECKDATE,CLRDATE,PAYEE$, ACCT !: GOTO MAINLOOP 08000 ! 08100 ERRMSG1: ! 08200 PRINT FIELDS “14,20,C 35”: “You entered an invalid check number” 08300 GOTO 5600 08400 ! 08500 ! Lines 8500 to 9900 -- 2 other error messages and error handling 10000 ! 10100 DONE: ! 10200 CLOSE #1: 10300 CLOSE #2: 10400 STOP 10500 ! 10600 FORM1: FORM POS 6,N 10.2,C 1,N 6,N 6,C 25,N 2 10700 FORM2: FORM POS 1,C 15
In line 5100, the CHEKBOOK.INT file is opened with the key file CHEKBOOK.KEY for keyed processing. In line 5600, the operator enters a check number which is used for a key in the READ statement in line 5800. In the KEY= clause in line 5800 the STR$ function is used to convert CHECKNUM to a string; then LPAD$ is used to make sure the string is a 5-character string.
KEY= in line 5800 means that there must be an exact match on all characters in the key field. If an exact match is not found, the NOKEY ERRMSG1 clause at the end of line 5800 will cause the program to branch to the line label ERRMSG1 found in line 8100. In this error routine, a message is displayed on the bottom of the screen about the invalid check number; and then the program goes back to line 5600 to accept another check number.
When a valid check number is found, lines 5900 to 7900 are executed to do lots of wonderful stuff, mostly full screen processing. In line 6700, the RINPUT FIELDS statement displays several fields for possible updating. Notice that check number is not one of the fields which can be updated (although it is displayed by lines 6500 and 6600).
The system generates an error code 0059 whenever a program attempts to change the value of the current key field in the master file, it is good practice to never let operators change the key field. If you really need to change a key field, you must delete the record, then copy the information to a new record with the new key field, then add the new record with a WRITE statement.
The changed record is rewritten in line 7900. For reasons just pointed out above, the key field is not included in the list of variables to be rewritten. Notice that there is no KEY= clause in this REWRITE statement. In this case, it is possible to use the REWRITE statement both with and without the KEY=N$ clause. Without the KEY=N$ clause, the record that will be rewritten will be the last record read.
When you want to rewrite the last record you read (as in the UPDTCHEK program), it is much faster to use a REWRITE statement without the KEY=N$ clause. The reason for this speed difference is that the index file must be searched whenever a statement has a KEY=N$ clause. This search is bypassed when the KEY=N$ clause is left out. This same reasoning applies to the DELETE statement, which is also faster without the KEY=N$ clause.
Quick Quiz 16.7
True or False:
1. The REREAD statement can have a KEY=N$ clause.
2. When KEY=”1234” is used and the key length is 4, there must be an exact match on all four characters; otherwise, the NOKEY error condition occurs.
3. A REWRITE statement without a KEY=N$ clause can be used to change the key field.
4. The REWRITE statement can be used with or without a KEY=N$ clause.
5. When you are trying to DELETE the last record read, the DELETE statement works faster without the KEY=N$ clause than with it.
16.8 Sample report program to read sequentially by key
Just as the REWRITE statement can be used with or without the KEY=N$ clause, the READ statement has these same two options. Without any type of KEY=N$ clause, the READ statement will read the next record in key sequence. The next record in key sequence is probably not the record that follows in the master file.
When processing a file using the keyed method of access, you should think of the file pointer not as being positioned in the master file, but as being positioned in the key file. Also, for this purpose, think of the key field as being perfectly organized; think of it as if index has just been rebuilt.
The RESTORE statement positions the pointer within the index file, not within the master file. The RESTORE statement also can be used with and without a KEY=n$ clause (all four forms may be used). Without a KEY=n$ clause, the file pointer is moved to the beginning of the key file.
Also, when the file is opened, the pointer is initially positioned at the beginning of the key file. If the very first READ statement in a program does not have a KEY=n$ clause, the record read will be the one with the lowest value in the key field. The next READ statement without a KEY=n$ clause will read the record with the second lowest key value, then the third lowest, etc.
The result of opening a file for keyed access and reading all records without any KEY=n$ clauses is that the records will all be read in key order, as if the file had been sorted or arranged by the values in the key field in ascending order. This technique has the special name read sequential by key. If the key field were last name, a report program reading sequentially by key would print the report in alphabetical order by last name. Let’s return to our checkbook system and look at the program RPTBYCHK listed below which reads sequentially by key. As in the UPDTCHEK program in the previous section, the key file CHEKBOOK.KEY uses the field containing check number as the key field. So, the printout from the RPTBYCHK (sounds like report by check) will be in ascending order by check number.
00100 ! ********************** R P T B Y C H K ****************** 00200 ! PURPOSE: List all checks by check number (read-seq-by-key) 00300 ! Business Rules! 00500 ! ********************************************************* 00600 ! 00700 DIM PAYEE$*25 00800 ! 00900 LET CHECKBOOK = 1 01000 LET ACCOUNTS = 2 01100 OPEN #CHECKBOOK: “NAME=CHEKBOOK.INT,KFNAME=CHEKBOOK.KEY”,INTERNAL,INPUT,KEYED 01200 OPEN #ACCOUNTS: “NAME=ACCOUNTS.INT”,INTERNAL,INPUT,RELATIVE 01300 ! 01400 PRINT #255,USING HFORM: “date”,”Check”,”Description”,”Amount”,”W/D”,”Account” 01500 PRINT #255,USING HFORM: “--------“,”-----“,”-------------------------“,”-------“,”---“,”---------------“ 01600 ! 01700 MAINLOOP: ! 01800 READ #CHECKBOOK,USING CHECKFORM: CHECK,AMOUNT,DWFLAG$,DATE, PAYEE$,ACCT EOF ALLDONE 01900 IF CHECK=0 OR DWFLAG$=”D” OR DWFLAG$=”d” THEN GOTO MAINLOOP 02000 READ #ACCOUNTS,USING ACCTFORM,REC=ACCT: ACCTDESC$ 02100 REM Next line changes date from yy/mm/dd to mm/dd/yy 02200 LET CDATE$ = CNVRT$(“PIC(######)”,DATE) !: LET DATE = VAL(CDATE$(3:6)&CDATE$(1:2)) 02300 PRINT #255,USING DFORM: DATE,CHECK,PAYEE$,AMOUNT,DWFLAG$, ACCTDESC$ 02400 LET TOTAL=TOTAL+AMOUNT 02500 GOTO MAINLOOP 02600 ! 02700 ALLDONE: ! End of file processing 02800 PRINT #255,USING TFORM: TOTAL 02900 CLOSE #ACCOUNTS: 03000 CLOSE #CHECKBOOK: 03100 STOP 03200 ! 03300 ! Form Statements 03400 HFORM: FORM POS 2,C 8,X 2,C 5,X 2,C 25,X 6,C 7,X 2,C 3,X 2, C 15 03500 DFORM: FORM POS 2,PIC(Z#/##/##),X 2,PIC(ZZZZ#),X 2,C 25,X 2, PIC(ZZZZZZZ#.##),X 3,C 1,X 3,C 15 03600 TFORM: FORM SKIP 1,POS 18,” Total For All Checks”,X 6,
03700 CHEKFORM: FORM N 5,N 10.2,C 1,N 6,POS 29,C 25,N 2 03800 ACCTFORM: FORM C 15
This report program uses variables for the file number. Any file processing program could use this technique; it is not limited to keyed access, or even to internal files. Instead of using integer constants to identify files (for example, #1 or #2), this program uses the variables CHECKBOOK and ACCOUNTS, which are initialized (that means they are first assigned values) in lines 900 and 1000, respectively.
These variable names are also used in the READ statements in lines 1800 and 2000. This technique makes it easy to remember which file is being read. Variable names are also used for file numbers in lines 2900 and 3000. Because ACCOUNTS was assigned a value of 2 and never changed, it is as if line 2900 were:
2900 CLOSE #2:
Column headings for this report are printed in lines 1400 and 1500. The main processing loop runs from line 1700 to 2500. The loop ends when the EOF error condition at the end of line 1800 transfers control to the label ALLDONE in line 2700, where a total is printed for all checks listed.
There are two points in this program which are necessary to the technique of reading sequentially be key. First, the OPEN statement in line 1100 specifies the KEYED method of access. Second, the READ statement in line 1800 does not have a KEY=n$ clause.
Quick Quiz 16.8
True or False:
1. When a file is opened INPUT, KEYED, the RESTORE statement without any type of KEY=n$ clause will position the file pointer at the beginning of the master file.
2. When a file is opened INPUT, KEYED, the READ statement without any type of KEY=n$ clause will read the next record in the master file.
3. When a file is to be read sequentially by key, it should be opened SEQUENTIAL.
4. When a file is to be read sequentially by key, the READ statements should include some type of KEY=n$ clause.
5. Reading a file sequentially by key is a quick way to produce a report that looks like the file has been sorted in descending order according to the key field.
6. A variable can be used for a file number in statements like OPEN, READ, REWRITE, and CLOSE; however, this technique is limited to keyed files.