Neil's Tech Tips

Thursday, December 1, 2016

Importing Data using Suprtool

Import Data Using Suprtool

One of the more recent questions that has come up lately is How Do I Import data with Suprtool. Let's say we have the following data layout, Image, Eloquence or SD file, it doesn't matter. The purpose of this small white paper would be to extract data to a ".csv" file, and then import that same data back. Closing the loop so to speak.

BIN-NO ~ J1
LAST-SHIP-DATE ~ J2
ON-HAND-QTY ~ J2
PRODUCT-NO ~ Z8
SUPPLIER-NO ~ Z8
UNIT-COST ~ P8
ITEM-DESC1 ~ X20
ITEM-DESC2 ~ X20
ITEM-DESC3 ~ X20
ITEM-DESC4 ~ X20

To begin our project we will extract the data from a sample data source and output to an SD file and then use STExport to create a CSV file.

>get d-inventory
>out dinv,link
>xeq
IN=13, OUT=13. CPU-Sec=1. Wall-Sec=1.

:run stexport.pub.robelle
$in dinv
$out dinvcsv
$xeq
In=13. Out=13. CPU-Sec=1. Wall-Sec=1.
$listf dinvcsv,2

ACCOUNT=  GREEN       GROUP=  NEIL
FILENAME  code  ------------LOGICAL RECORD-----------  ----SPACE----
                  SIZE  TYP        EOF      LIMIT R/B  SECTORS #X MX

DINVCSV           152B  VA          13         13   1       16  1  *

So at this point we know have a file, typically how a customer would want to import using Suprtool for adding to a database etc. Note that Suprtool does not handle variable-length files so the first step is to convert the Variable length file to fixed length.

Step One: Convert to Fixed Length

On MPE you can convert using Fcopy:

/file dinvcsvf;rec=-152,1,f,ascii
/fcopy from=dinvcsv; to=*dinvcsvf;new

On HP-UX you can use Qedit:

/t dinvcsv
/set keep var off
/k dinvcsvf

Or Awk: (Thanks to Barry Lake from Allegro for fixify.sh)

#!/bin/sh
#
# Script to turn a typical bytestream file (variable length
# records) into a file with fixed length records by padding
# records with spaces as needed. The resulting record length
# will be that of the longest record in the file.

export INFILE=$1
OUTFILE=$(mktemp)

# Step 1: Find the length of the longest record in the input file
#         file and store it in a variable for use in the next step.

export MAXLEN=$(awk 'BEGIN { len = 0; }
  { if (length > len)
       len = length; }
  END { print len; }' $INFILE )

echo Maximum record length in \"$INFILE\" is $MAXLEN.

# Step 2: Get the MAXLEN variable; use it to create a string
#         of that many blanks; then use that to pad each input
#         record as needed to make it the same length as the
#         longest record.

awk 'BEGIN { "echo $MAXLEN" | getline maxlen; spaces = "";
             for(i=0; i < maxlen * 1; i++)
                spaces = " "spaces; }
  { print substr($0 spaces, 1, maxlen) }' $INFILE >$OUTFILE

# Step 3: awk can't edit a file in place so its output was
#         written to a new file which we then pour back into the
#         original file. Note: we could just as easily have done
#         cp $OUTFILE $INFILE or mv $OUTFILE $INFILE, but in that
#         case we'd lose the original file's creator and
#         permission bits.

cat $OUTFILE > $INFILE
rm $OUTFILE

You can run the above script:

$./fixify.sh datafile

At this point Suprtool can now read what is essentially variable-length data in a fixed length file.

Now remember what a typical CSV file looks like! Byte fields are surronded by quotes and separated or delimited with commas:

159,19910827,1,50532001,5053,9449,"Test index","Test index","Test index","Test index"

Step Two: Split out the Fields

The first step is to separate out each field from the record based on the delimeter, which in this case is the comma. I use byte lengths for each number field based on the rules for output ,ascii table.

Duplicated below:

  I1 J1 06 bytes      
  I2 J2 11 bytes
  I3 J3 16 bytes      
  I4 J4 20 bytes
  K1    05 bytes      
  K2    10 bytes
  Zn    n+1 bytes     
  Pn    n bytes

in dinvcsvf
{define targets}
def bin-x,1,6
def last-x,1,11
def qty-x,1,11
def prod-x,1,9
def supp-x,1,9
def cost-x,1,8
def desc1-x,1,22 {Note room for quotes}
def desc2-x,1,22
def desc3-x,1,22
def desc4-x,1,22

{define source}
def record,1,152

{extract using split }
 
ext bin-x=$split(record,first,",")
ext last-x=$split(record,",",1,",")
ext qty-x=$split(record,",",2,",")
ext prod-x=$split(record,",",3,",")
ext supp-x=$split(record,",",4,",")
ext cost-x=$split(record,",",5,",")
ext desc1-x=$split(record,",",6,",")
ext desc2-x=$split(record,",",7,",")
ext desc3-x=$split(record,",",8,",")
ext desc4-x=$trim($split(record,",",9,last))

{where?}
out myfile,link
xeq

Step Three: Closing the Loop

Now we just need to "close the loop" and extract the individual byte type fields into their appropriate data types. In this step we also "clean" the data of the double quotes.

Keep in mind that the data format that we want:

BIN-NO ~ J1
LAST-SHIP-DATE ~ J2
ON-HAND-QTY ~ J2
PRODUCT-NO ~ Z8
SUPPLIER-NO ~ Z8
UNIT-COST ~ P8
ITEM-DESC1 ~ X20
ITEM-DESC2 ~ X20
ITEM-DESC3 ~ X20
ITEM-DESC4 ~ X20

The defines for the above are in the section starting with the comment {Actual targets}

in myfile

{re-define number in display format}
def bin-z,bin-x,display
def last-z,last-x,display
def qty-z,qty-x,display
def prod-z,prod-x,display
def supp-z,supp-x,display
def cost-z,cost-x,display

{Actual targets}

{number}
def bin ,1,2,integer
def last,1,4,double
def qty ,1,4,double
def prod,1,8,display
def supp,1,8,display
def cost,1,4,packed

{bytes}
def desc1,1,20 
def desc2,1,20
def desc3,1,20
def desc4,1,20 

ext bin=$number(bin-z)
ext last=$number(last-z)
ext qty=$number (qty-z)
ext prod=$number(prod-z)
ext supp=$number(supp-z)
ext cost=$number(cost-z)

clean '"'
ext desc1=$trim($clean(desc1-x))
ext desc2=$trim($clean(desc2-x))
ext desc3=$trim($clean(desc3-x))
ext desc4=$trim($clean(desc4-x))

out loop,link
xeq

The SD file (loop) is now in the same format and layout of the original Dataset extraction prior to the STExport task. We've now closed the loop.

>form loop
    File: LOOP.NEIL.GREEN     (SD Version B.00.00)
       Entry:                     Offset
          BIN                  I1      1
          LAST                 I2      3
          QTY                  I2      7
          PROD                 Z8     11
          SUPP                 Z8     19
          COST                 P8     27
          DESC1                X20    31
          DESC2                X20    51
          DESC3                X20    71
          DESC4                X20    91
    Limit: 13  EOF: 13  Entry Length: 110  Blocking: 37

So to summarize the first step is to split out the data into separate fields with byte data types. Then to convert by re-defining with a new name to reference the data as display and use $clean and $number to extract into the final targets.

Normalizing Zip Codes

Fix all of My Zip codes

Recently a customer asked how they could fix some of the Zip codes which were in the form of:

123450000

and are stored in a numeric J2 field.

The customer wanted to normalize those zip codes that had the four trailing zeroes to be 12345 instead of 123450000.

The first step we wanted to do was to determine all of the codes that needed to be updated and what the old zip would be and what the new zip would be.

 >base membrs
 >get member-file
 >def new-zip,1,4,double
 >if zip > 99999 and (zip mod 10000) = 0
 >ext account
 >ext zip
 >ext new-zip = zip / 10000
 >list
 >xeq

So what is the above doing?

Well the if command looks at all records that are greater than 99999 and ends in the four zeroes, which is what the (zip mod 10000) = 0 is doing. This should isolate just those records that the customer wanted to fix!

Once we determined that we had the correct records selected we easily updated them with:

 >base membrs
 >get member-file
 >if zip > 99999 and (zip mod 10000) = 0
 >update
 >ext new-zip = zip / 10000
 >list
 >xeq

How Do I Convert CM KSAM to NM KSAM

How Do I Convert CM Ksam to NM KSAM?

Suprtool can't/won't create a new NM KSAM file for you. You will have to create the new NM file first using the MPE BUILD command (or some other method, see below), then use Suprtool to copy the records from the old file to the new file.

But Suprtool should definitely be able to copy the data *much* faster than MPE's FCOPY utility could.

The trick, of course, is getting a new NM KSAM file built correctly first. You could use a complicated BUILD command or you could do this:

 :file n=newfile; disc=
 :fcopy from=oldfile; to=(*n); subset=0,1

That will create a new NM KSAM file called NEWFILE with the FLIMIT equal to the FLIMIT of your current CM file, and with all the keys set up correctly, but will copy only a single record into it. Then use Suprtool:

 :suprtool
 >in oldfile
 >out newfile, erase
 >xeq

Finally, if it's a big KSAM file you may wish to create it as temporary first, then SAVE it after you're done, to squeeze even more speed out of the process:

 :file n=newfile; disc=; temp     <<-- Note the temp designation
 :fcopy from=oldfile; to=(*n); subset=0,1
 :suprtool
 >in oldfile
 >out newfile, erase
 >exit
 :save newfile

Why is my MPE job suddenly failing?

Why is my Job Suddenly Failing?

I had a recent call in to support asking why all of a sudden some jobs started failing in a job stream. The report was a blank line after run of a program suddenly started giving the error:

Missing colon before command name. (CIERR 981)

In previous incarnations the job stream with the blank line would NOT fail but nothing changed... of course. :)

First off I want to confirm that a blank line does abort a job stream.

:comment bland line is next Missing colon before command name. (CIERR 981) REMAINDER OF JOB FLUSHED. CPU sec. = 1. elapsed min. = 1. THU, FEB 4, 2010, 10:47 AM.

And I found it does.

The customer though showed that the blank line was after the run of a program and in this case I am using query as an example. We didn't see the same behaviour. Odd.

	:comment
	:comment test blank line after run query.pub.sys
	:comment
	:purge file1x
	:query


	HP32216N.03.18  QUERY/NM  THU, FEB  4, 2010, 10:18 AM
	COPYRIGHT HEWLETT-PACKARD CO. 1976

	exit


	END OF PROGRAM
	:showjcw jcw
	JCW = 0
	:showjcw cierror
	CIERROR = 0
	:eoj
	CPU sec. = 2.  elapsed min. = 1.  THU, FEB  4, 2010, 10:18 AM.

Some investigation and some clues from something I had remembered investigating in 2001 allowed me to duplicate by switching to query being run/resolved via a command file.

In the above example even though the commands were the same, query was being run via path resolution. This is the part that was tricky and I later make the examples explicit by specifically running query.pub.sys or using my query command file:

	:comment
	:comment  Blank line after query which resolves to a command file
	:comment
	:query


	HP32216N.03.18  QUERY/NM  THU, FEB  4, 2010, 10:20 AM
	COPYRIGHT HEWLETT-PACKARD CO. 1976

	exit


	END OF PROGRAM

	Missing colon before command name. (CIERR 981)
	REMAINDER OF JOB FLUSHED.

Another example but this time with garbage commands:

	:comment query command file with garbage after exit
	:query


	HP32216N.03.18  QUERY/NM  THU, FEB  4, 2010, 10:44 AM
	COPYRIGHT HEWLETT-PACKARD CO. 1976

	exit


	END OF PROGRAM
	yabba dabba doo
	Missing colon before command name. (CIERR 981)
	REMAINDER OF JOB FLUSHED.
	CPU sec. = 2.  elapsed min. = 1.  THU, FEB  4, 2010, 10:44 AM.

Now I switched back to running directly with the yabba dabba doo still in the job and some other lines:

	:comment
	:run query.pub.sys


	HP32216N.03.18  QUERY/NM  THU, FEB  4, 2010, 10:45 AM
	COPYRIGHT HEWLETT-PACKARD CO. 1976

	exit


	END OF PROGRAM
	:showjcw jcw
	JCW = 0
	:showjcw cierror
	CIERROR = 383
	:eoj
	CPU sec. = 2.  elapsed min. = 1.  THU, FEB  4, 2010, 10:45 AM.

In the case of a run of the program file the garbage lines after the run are being read as $stdin until it sees a "!". So what happens is the program exits and the extra lines left in the stdin are thrown away.

However, if you switch this run to a command file, suddenly the job stops running to completion.

So, if you have a customer that switches from say

query

which is run thru HPPATH resolution, to having

query

be resolved via a command file, the jobs with blank lines or garbage after the exit line will suddenly stop failing.

It's a bit tricky to diagnose especially if the customer doesn't believe you or points to something else. I have not tested UDC's but I would believe that they would have the same behaviour. My guess is that when a command file ends MPE just reads the next line in the job stream, but when a program ends the job stream is trying to re-orient itself in regard to stdin and just reads until is sees a "!". That is the behaviour that I am seeing.

Any past labbies or gurus care to comment or have thoughts.

While loops in Qedit

How do I do a while loop in Qedit/UX?

There are occasions when you need to do a while loop to repeat multiple edit commands in Qedit, typically the edit operations are not on the same line as the condition or data you want to change or edit.

Recently we had the following request:

I am using Qedit for HPUX. I have a file where I would like to execute the following qedit commands until end of file is reached:

fq 707C(1/4)
lq * - 1
c 5/5 C
lq * + 1

Basically, I am trying to change the line previous to one with 707C in positions 1 to 4 for any occurrence of a line with 707C in positions 1 to 4. Is that possible?

Naturally, I wanted to help the customer get the job done, so here is what I came up with in a file called mytest.

rm qedituse
export lines=$(grep '^707C' file | wc -l)
echo tq file >>qedituse
echo lq [    >>qedituse
x=0
while [ $x -lt $lines ]; do 
echo "fq '707C'(1/4);cq 5/5 'C' *-1;lq *+1"  >>qedituse
x=`expr $x + 1`
done
echo k newfile,yes >> qedituse
echo exit          >> qedituse
qedit -c'use qedituse'

The resulting file had the lines with C in column 5 preceeding the 707C. blinC 707C ack ick bla asdfC 707C hippy hoppy frippy froppy a 707 ha ack C 707C Neo%/home/neil: The process broken down as follows: The following just removes the file qedituse and then stores in a variable called lines the number of lines that start with 707C.

rm qedituse
export lines=$(grep '^707C' file | wc -l)

The next two lines simply text in the file and positions the file pointer at the beginning of the file.

echo tq file >>qedituse
echo lq [    >>qedituse

The next portion, is the while loop that will go thru the file "lines" number of times and issue the desidred qedit commands. Notice that cq 5/5 'C' *-1, means put a C in column 5 of the preceding line.

x=0
while [ $x -lt $lines ]; do 
echo "fq '707C'(1/4);cq 5/5 'C' *-1;lq *+1"  >>qedituse
x=`expr $x + 1`
done

The next two lines keep the file and exit

echo kq newfile,yes >> qedituse
echo exit           >> qedituse

The following is run qedit with the editor commands and actually do the task.

qedit -c'use qedituse'

Given what I learned now I would write things a little more compact and do as follows:

rm qedituse
echo t file;l [ >>qedituse
export lines=$(grep '^707C' file | wc -l)
x=0
while [ $x -lt $lines ]; do 
echo "fq '707C'(1/4);cq 5/5 'C' *-1;lq *+1"  >>qedituse
x=`expr $x + 1`
done
echo k newfile,yes;exit >> qedituse
qedit -c'useq qedituse'

You could also use parms and variables to make this even more powerful by allowing you to specify the string to search for etc.

Difference Between Two Dates

How do I find the difference between two dates

While on training recently I was asked by a customer to find the difference between a due-date for a payment and the current date. However, they only wanted to list the dates that were overdue as the number of days overdue, and if the payment was not overdue then they should show a zero as the days difference.

Now the best way to calculate a difference between two days is to convert the date using the $days function and subtract the two dates.

The $days function converts any date to the number of days since 4713 BC, and is known as Julian Day number. Therefore you can perform or figure out the difference between two dates. The task below has a couple of tricks, the first is to use Suprtool to build a command file to get the current date into a variable in Julian Day format!

The second part of the script figures out the difference between the two dates, keep in mind that the customer wants to know which dates are "overdue" and only want to see those that are overdue, so we re-order the subtraction such that overdue payments will be a positive number and those payments that are not yet due, will be a negative number.

rm tdays tdays.sd
rm setdate
export EQ_DBSERVER=":8202"
suprtool << EOD_Suprtool
{ Set $mydate variable to todays date in julian day format }
base membrs,5,lookup
get member-file
num 1
def tdays,1,4,double
item tdays,date,julian
ext tdays=\$today
out tdays,link
xeq
in tdays
def tdayascii,1,7,byte
ext 'export MYJULDAY="'
ext tdayascii=\$edit(tdays,"zzzzzzz")
ext '"'
out setdate
exit
EOD_Suprtool
chmod +x setdate
. ./setdate
rm setdate
suprtool << EOD_Suprtool
set varsub on
in dates
def diff,1,4,double
item a,date,ccyymmdd
ext a
ext diff=(\$MYJULDAY - \$DAYS(A))
list
xeq
exit
EOD_Suprtool

Given some dates here is what you would see:

 >IN dates (0) >OUT $NULL (0)
A               = 20131213       DIFF            = 47

>IN dates (1) >OUT $NULL (1)
A               = 20131231       DIFF            = 29

>IN dates (2) >OUT $NULL (2)
A               = 20140201       DIFF            = -3

>IN dates (3) >OUT $NULL (3)
A               = 20140130       DIFF            = -1

>IN dates (4) >OUT $NULL (4)
A               = 20140115       DIFF            = 14

IN=5, OUT=5. CPU-Sec=1. Wall-Sec=1.

>exit

Now since we only want to see positive numbers, if we change the target type to be logical, we will only get the difference in days with a positive number and the negatives will become zero since the definition of a logical number is that it cannot be negative.

 Neo%dev/source/suprtool: ./datediff
rm: setdate non-existent
SUPRTOOL/UXia/Copyright Robelle Solutions Technology Inc. 1981-2014.
(Version 5.6 Internal)  WED, JAN 29, 2014, 11:55 AM  Type H for help.
Build 11
>{ Set  variable to todays date in julian day format }
>base membrs,5,lookup
>get member-file
>num 1
>def tdays,1,4,double
>item tdays,date,julian
>ext tdays=$today
>out tdays,link
>xeq

Warning:  NUMRECS exceeded; some records not processed.
IN=2, OUT=1. CPU-Sec=1. Wall-Sec=1.

>in tdays
>def tdayascii,1,7,byte
>ext 'export MYJULDAY="'
>ext tdayascii=$edit(tdays,"zzzzzzz")
>ext '"'
>out setdate
>exit
IN=1, OUT=1. CPU-Sec=1. Wall-Sec=1.

SUPRTOOL/UXia/Copyright Robelle Solutions Technology Inc. 1981-2014.
(Version 5.6 Internal)  WED, JAN 29, 2014, 11:55 AM  Type H for help.
Build 11
>set varsub on
>in dates
>def diff,1,4,logical  {note target is now logical}
>item a,date,ccyymmdd
>ext a
>ext diff=($MYJULDAY - $DAYS(A))
>list
>xeq
>IN dates (0) >OUT $NULL (0)
A               = 20131213       DIFF            = 47

>IN dates (1) >OUT $NULL (1)
A               = 20131231       DIFF            = 29

>IN dates (2) >OUT $NULL (2)
A               = 20140201       DIFF            = 0

>IN dates (3) >OUT $NULL (3)
A               = 20140130       DIFF            = 0

>IN dates (4) >OUT $NULL (4)
A               = 20140115       DIFF            = 14

IN=5, OUT=5. CPU-Sec=1. Wall-Sec=1.

>exit

Normalizing Data

How do I Normalize Phone Numbers?

Recent question about Phone Numbers and how to remove non-number characters from a byte container, raised some interesting solutions to normalizing phone numbers:

Considering the following data, you see that the phone numbers have all sorts of different formats.

>in myphone
>list
>xeq
>IN myphone (0) >OUT $NULL (0)
PHONENUM        = #123.456.7890

>IN myphone (1) >OUT $NULL (1)
PHONENUM        = (123)567-1234

>IN myphone (2) >OUT $NULL (2)
PHONENUM        = (321).123.5678

IN=3, OUT=3. CPU-Sec=1. Wall-Sec=1.

The steps in normalizing the data is to remove the non-numeric numbers:

>in myphone
>set cleanchar ""
>clean "^0:^47","^58:^255"
>def newphone,1,14
>ext phonenum=$clean(phonenum)
>out newphone,link
>xeq
IN=3, OUT=3. CPU-Sec=1. Wall-Sec=1.

>in newphone
>list
>xeq
>IN newphone (0) >OUT $NULL (0)
PHONENUM        = 1234567890

>IN newphone (1) >OUT $NULL (1)
PHONENUM        = 1235671234

>IN newphone (2) >OUT $NULL (2)
PHONENUM        = 3211235678

IN=3, OUT=3. CPU-Sec=1. Wall-Sec=1.

You can then use an edit mask to format it in the same way. You do need to redefine the field being edited with a define of the number with just the length of the phone number:

>in newphone
>form
    File: newphone     (SD Version B.00.00)  Has linefeeds
       Entry:                     Offset
          PHONENUM             X14     1
    Entry Length: 14  Blocking: 1
>def my,phonenum,10
>def targ,1,12
>ext targ=$edit(my,"xxx.xxx.xxxx")
>list
>xeq
>IN newphone (0) >OUT $NULL (0)
TARG            = 123.456.7890

>IN newphone (1) >OUT $NULL (1)
TARG            = 123.567.1234

>IN newphone (2) >OUT $NULL (2)
TARG            = 321.123.5678

IN=3, OUT=3. CPU-Sec=1. Wall-Sec=1.