zip31b wrong Unix end-of-line char X'85'
fits
July 9, 2010, 10:49am
Baby Member
Posts: 17
On zip31b, when -l is omitted from the PARM card, the wrong Unix end-of-line character X'85' is used. On zip232 the correct Unix end-of-line character X'0A' is used. Is there a way to set a correct default? Currently we use -ll to get the correct result.
Josef 

EG
July 11, 2010, 8:02am
Info-ZIP Team
Posts: 463
Can you provide step-by-step details of how to recreate an example of the problem?  Also, is this still a problem with Zip 3.1c that just went out?

fits
July 12, 2010, 8:42am
Baby Member
Posts: 17
It is still a problem with zip 3.1c.
To recreate the problem I tried the following steps with the new zip 3.1c:
1.) compiled the new zip 3.1c version with the make -f mvs.mki command
2.) executed the following zip batch job with the -a option:
//ZIPTEST JOB (#ACCT,IZ),'ZIPTEST',CLASS=P,MSGCLASS=T,NOTIFY=&SYSUID 
//ZIP      EXEC PGM=ZIP,                                           
// PARM='/ -a dd:archive -@'
//STEPLIB  DD  DSN=IZN.XMITIP.LOAD,DISP=SHR                           
//SYSPRINT DD  SYSOUT=*                                               
//SYSOUT   DD  SYSOUT=*                                               
//CEEDUMP  DD  SYSOUT=*                                               
//ARCHIVE  DD  DSN=TEST.UNZIP.ARCHIV07.ZIP,DISP=(,CATLG,DELETE),   
//             SPACE=(CYL,(1,1),RLSE),                                
//             MGMTCLAS=DEL30T                                        
//SYSIN    DD  *                                                      
'TEST.UNZIP.ZIPIN.KOERNER.VAR'                                     
//*    
3.) copied the archive "TEST.UNZIP.ARCHIV07.ZIP" to my PC
4.) extracted the file /TEST/UNZIP/ZIPIN/KÖRNER.VAR
5.) uploaded the extracted file /TEST/UNZIP/ZIPIN/KÖRNER.VAR in binary mode to z/OS
6.) viewed the file in TSO in hex with the DISPLAY ASCII command, with the following result:
 ------------------------------------------------------------------------------
MarzipanNr;StatusDatum;Sp.Datum;Primõrbetreuer;Referenznummer;UnterNr;Kontoart;K
467767664735767774677635724677635766E7667767673566676676766673567674734667667734
D12A901EE2B3414534145DB30E4145DB029D4225425552B256525EAE5DD52B5E452E2BBFE4F124BB
 ------------------------------------------------------------------------------
chaden;Bruttoschadenà91001654;20091016;20091016;90243/Fr.Auer;1/0063/110/00;0;6;
66666634777767666666833333333333333333333333333333333247247673323333233323333333
38145EB22544F338145E591001654B20091016B20091016B90243F62E1552B1F0063F110F00B0B6B
 ------------------------------------------------------------------------------ 
7.) Looking after the word "Bruttoschaden", we can see that the line delimiter is X'85'
8.) doing the same procedure with ZIP232, we can see the correct result of X'0A'
 ------------------------------------------------------------------------------
MarzipanNr;StatusDatum;Sp.Datum;Primõrbetreuer;Referenznummer;UnterNr;Kontoart;K
467767664735767774677635724677635766E7667767673566676676766673567674734667667734
D12A901EE2B3414534145DB30E4145DB029D4225425552B256525EAE5DD52B5E452E2BBFE4F124BB
 ------------------------------------------------------------------------------
chaden;Bruttoschaden.91001654;20091016;20091016;90243/Fr.Auer;1/0063/110/00;0;6;
66666634777767666666033333333333333333333333333333333247247673323333233323333333
38145EB22544F338145EA91001654B20091016B20091016B90243F62E1552B1F0063F110F00B0B6B

So I think the old ZIP232 version uses the X'0A' line delimiter with option -a, which is the correct value on Unix. It makes sense to use the same delimiter in the new ZIP31c version. Currently I can only circumvent this problem by using the option -all on zip31c. But this means that someone who upgrades from ZIP232 to ZIP31C has to change the -a option to -all in all of his ZIP jobs. So it seems easier to me to use the old, correct line delimiter as it was in ZIP232.
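
For readers unfamiliar with the vertical hex display used above: each character column shows the high nibble on the first hex row and the low nibble on the second. The following is a minimal Python sketch, not part of the original post, that recombines the two rows into bytes so the delimiter after "Bruttoschaden" can be checked directly. The function name is made up for illustration, and only the first 29 columns of each record are copied in.

```python
def decode_vertical_hex(high_row: str, low_row: str) -> bytes:
    """Combine the two nibble rows of a vertical hex display into bytes."""
    if len(high_row) != len(low_row):
        raise ValueError("nibble rows must have the same length")
    return bytes(int(h + l, 16) for h, l in zip(high_row, low_row))


# First 29 columns of the second record from each display above.
HIGH_31C = "66666634777767666666833333333"
LOW_31C = "38145EB22544F338145E591001654"
HIGH_232 = "66666634777767666666033333333"
LOW_232 = "38145EB22544F338145EA91001654"

zip31c = decode_vertical_hex(HIGH_31C, LOW_31C)
zip232 = decode_vertical_hex(HIGH_232, LOW_232)

# Byte 20 is the delimiter that follows "chaden;Bruttoschaden".
print(hex(zip31c[20]))  # 0x85
print(hex(zip232[20]))  # 0xa
```

Apart from that one byte the two decoded records are identical, which is what the post is pointing at.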

EG
July 13, 2010, 12:20am
Info-ZIP Team
Posts: 463
Looks reasonable.

I posted a note here:

  http://www.info-zip.org/board/board.pl?m-1197474780/s-60/#num72

to see if anyone reading that has issues with this change.

EG
July 13, 2010, 12:57am
Info-ZIP Team
Posts: 463
Looking this over, I have some questions.

Quoted from fits
2.) executed following zip batch job with -a option:
//ZIPTEST JOB (#ACCT,IZ),'ZIPTEST',CLASS=P,MSGCLASS=T,NOTIFY=&SYSUID 

...
'TEST.UNZIP.ZIPIN.KOERNER.VAR'                                     
//*    

Can you attach the file KOERNER.VAR, assuming that's the input file and there's nothing sensitive about its contents?

Quoted from fits
4.) extracted the file
/TEST/UNZIP/ZIPIN/KÖRNER.VAR
5.) Put the extracted file  /TEST/UNZIP/ZIPIN/KÖRNER.VAR in binary to z/OS

Is this the same file that was added to the archive above?  Seems the names are different.  How was the file moved?  Any chance that what moved it did any line end conversions?

Quoted from fits
7.) Looking after the word "Bruttoschaden", we can see that the line delimiter is X'85'
8.) doing the same procedure with ZIP232, we can see the correct result of X'0A'

I'm not sure where the X'85' is coming from?

Can you put together a very small text file that recreates this problem?  Just lines like:

123456789a
123456789b
123456789c

That should be enough to verify the line ends.  Then attach the original file, the archive with it, and the extracted file.  Note where the zipping and the unzipping was done and all parameters.

fits
July 14, 2010, 11:26am
Baby Member
Posts: 17
Hello,
I have created a three-line test file, attached as ebcdic_1117.txt. I have also attached it in ASCII as ascii_1535.txt.
This input file was zipped with zip31c to the output archive attached as ebcdic_6161.zip. ebcdic_6161.zip was then
downloaded in binary mode to the PC and extracted with WinZip to the attached file extract_9834.txt. The extracted file was then uploaded in binary mode to z/OS and displayed in hex, where it looks like this:
 ------------------------------------------------------------------------------
Line1àLine2àEndà................................................................
46663846663846680000000000000000000000000000000000000000000000000000000000000000
C9E515C9E5255E450000000000000000000000000000000000000000000000000000000000000000
 ------------------------------------------------------------------------------ 

In the display we can see that the line delimiter is X'85'.
The job used for zip looks like this:
//IZ007601 JOB (520000,IZ),'BERGER',CLASS=P,MSGCLASS=T,NOTIFY=&SYSUID  
//DELDSN   EXEC PGM=IDCAMS                                             
//SYSPRINT DD SYSOUT=*                                                 
//SYSIN    DD *                                                        
  DEL IZ00760.UNZIP.INPUT.EBCDIC.ZIP                                   
  SET MAXCC=0                                                          
/*                                                                     
//* NOTE THE PARAMETER LINE IS LOWERCASE, EXCEPT FOR -B                
//* -V VERBOSE MODE                                                    
//* -A COMPRESS THE FILE AS ASCII                                      
//* -L TRANSLATE THE UNIX END-OF-LINE CHAR LF INTO MSDOS CRLF
//* -J INDICATES NOT TO SAVE THE PATH, JUST THE FILENAME               
//* -K ATTEMPT TO MAKE NAMES MSDOS COMPATIBLE                          
//* -B COMPRESS THE FILE AS BINARY, NOT EBCDIC                         
//* DD:ARCHIVE INDICATES USE DD STATEMENT //ARCHIVE AS OUTPUT ZIP FILE 
//* -@ INDICATES TO READ THE NAMES OF THE FILE TO ZIP FROM //SYSIN     
//ZIP      EXEC PGM=ZIP,                                               
// PARM='/ -a dd:archive -@'
//STEPLIB  DD  DSN=IZ00760.INFOZIP.LOAD,DISP=SHR                       
//SYSPRINT DD  SYSOUT=*                                                
//SYSOUT   DD  SYSOUT=*                                                
//CEEDUMP  DD  SYSOUT=*                                                
//ARCHIVE  DD  DSN=IZ00760.UNZIP.INPUT.EBCDIC.ZIP,DISP=(,CATLG,DELETE),
//             SPACE=(CYL,(1,1),RLSE),                                 
//             MGMTCLAS=DEL30T                   
//SYSIN    DD  *                                 
'IZ00760.UNZIP.INPUT.EBCDIC.TXT'                 
//*                                              

hope this helps,
Josef 
 

Attachment: ebcdic_1117.txt
33 downloads   -   Size: 0.01 KB

Attachment: ascii_1535.txt
31 downloads   -   Size: 0.02 KB

Attachment: ebcdic_6161.zip
27 downloads   -   Size: 0.17 KB

Attachment: extract_9834.txt
32 downloads   -   Size: 0.02 KB


fits
July 14, 2010, 12:34pm
Baby Member
Posts: 17
Hello,
Sorry, it's my mistake! I changed the code page translation file ebcdic.c and forgot to include Paul_von_Behren's changes in it.
In the original file ebcdic.c (uses cp1047, cp0819):
ascii section translates  X'15' -> X'0A'
                          X'25' -> X'85'
ebcdic section translates X'0A' -> X'15'
                          X'85' -> X'25'
My modified file ebcdic.c (uses cp1141, cp1252):
ascii section translates  X'15' -> X'85'
                          X'25' -> X'0A'
ebcdic section translates X'0A' -> X'25'
                          X'85' -> X'15'
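
The effect of the two tables on a line end can be sketched in a few lines of Python. This is only an illustration: it models just the two code points listed above, the dictionary and function names are made up and do not appear in ebcdic.c, and it assumes the input record ends with EBCDIC NL (X'15').

```python
# Only the two line-end code points from the tables above are modelled;
# the names below are illustrative and do not appear in ebcdic.c.
ORIGINAL_TO_ASCII = {0x15: 0x0A, 0x25: 0x85}  # original ebcdic.c (cp1047, cp0819)
MODIFIED_TO_ASCII = {0x15: 0x85, 0x25: 0x0A}  # modified ebcdic.c (cp1141, cp1252)


def translate(data: bytes, table: dict) -> bytes:
    """Translate bytes through a partial table, leaving unmapped bytes unchanged."""
    return bytes(table.get(b, b) for b in data)


# Assumption: the text record is terminated by EBCDIC NL (X'15').
record_end = bytes([0x15])

original_result = translate(record_end, ORIGINAL_TO_ASCII)
modified_result = translate(record_end, MODIFIED_TO_ASCII)

print(original_result.hex())  # 0a
print(modified_result.hex())  # 85
```

Under that assumption the original table yields the Unix LF, while the modified table yields X'85', matching the hex displays earlier in the thread.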
regards,
Josef
 
 

EG
July 14, 2010, 4:07pm
Info-ZIP Team
Posts: 463
Thanks for providing the extra data.  That was helpful!

I see the following file data in archive ebcdic_6161.zip (from my home-grown utility):
  --- file data ---
       4c 69 6e 65 31 0a 4c 69 6e 65 32 0a 45 6e 64 0a
       L  i  n  e  1     L  i  n  e  2     E  n  d
  --- end of file data ---

The line ends here look correct.  So it looks like the process of getting the file back to z/OS is suspect.

As it looks like you've found the problem, I guess we can consider this one solved.
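
For anyone without a comparable utility, the same kind of check can be sketched with Python's standard zipfile module. This is not the tool used above. To stay self-contained, the sketch builds a small in-memory archive as a stand-in for ebcdic_6161.zip; the member name "EBCDIC.TXT" is an assumption.

```python
import io
import zipfile

# Build a small in-memory archive as a stand-in for ebcdic_6161.zip
# (the member name "EBCDIC.TXT" is an assumption for this sketch).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("EBCDIC.TXT", b"Line1\nLine2\nEnd\n")

# Read the member back and dump its bytes, as one would for a real archive.
with zipfile.ZipFile(buf) as zf:
    data = zf.read("EBCDIC.TXT")

hex_dump = " ".join(f"{b:02x}" for b in data)
print(hex_dump)

# Collect which candidate line-end bytes occur in the stored data.
line_ends = {b for b in data if b in (0x0A, 0x0D, 0x85)}
print(sorted(hex(b) for b in line_ends))
```

For a real archive, replace the in-memory buffer with the path to the downloaded .zip file and use the member name reported by `zf.namelist()`.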

Al Dunsmuir
July 14, 2010, 5:12pm
Info-ZIP Team
Posts: 94
Folks,

Translations from EBCDIC to ASCII using the z/OS Language Environment (C runtime) facilities (especially iconv for translations) are defined  to return the ASCII NEL (0x85) character.

I would expect the same thing to happen on a Linux system where an interface returns a UTF-8 character string with an End-Of-Line.

If you want specific character sequences at End-Of-Line then code has to be added to look for the NEL and translate to LF (or CR+LF) as required. 
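
A minimal sketch of such a post-translation step, in Python for brevity rather than in Zip's own C code: it assumes the translated data is a single-byte stream in which 0x85 can only mean NEL. That assumption does not hold for every target code page (in cp1252, for example, 0x85 is the ellipsis character), so a real implementation would need to know the target character set. The function name is made up for illustration.

```python
NEL = b"\x85"  # "next line" control as a single ISO-8859-1 byte


def normalize_nel(data: bytes, eol: bytes = b"\n") -> bytes:
    """Replace every NEL byte with the requested end-of-line sequence."""
    return data.replace(NEL, eol)


translated = b"Line1\x85Line2\x85End\x85"

unix_text = normalize_nel(translated)            # LF line ends
dos_text = normalize_nel(translated, b"\r\n")    # CR+LF line ends

print(unix_text)
print(dos_text)
```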

A lot of this stuff should be platform-specific, and in the case of z/OS the default behaviours may differ between the MVS and USS environments.

Al

Al Dunsmuir
July 14, 2010, 5:23pm
Info-ZIP Team
Posts: 94
By the way, on z/OS there are two different line termination conventions for bytestream data from historical heritages:
- CR (x0D) + LF (x25) from Bisync terminals (TTYs)
- NL (x15) from VM and C/C++ compiler.

Defaults should be the standard character sets expected - ISO-8859-1 for ASCII, IBM-037 for MVS EBCDIC and IBM-1047 for USS EBCDIC.  Hard-coding any other character sets is a recipe for disaster... by all means add options to support them, but ensure that other locale-specific code is also dynamically changed to match.

BTW, I remember one thing discussed that made me cringe - depending on or setting the _EBCDIC preprocessor symbol.  This (like anything else with a leading '_') is an internal control symbol between the C/C++ compiler and runtime.  It is generated by the compiler, and used to pass the default character set from the ASCII|NOASCII compiler switch.  It is also used to pick LE runtime routines... which may or may not have the support that ZIP/UNZIP need and that may or may not assume that the current character set is static.  Now that the compiler has added support for Unicode literals, it gets even more complex.

z/OS is a really rough platform on which to play with character sets and runtime functions... lots of minefields and pitfalls.
Al

fits
July 16, 2010, 6:37am
Baby Member
Posts: 17
Al,
Is there a possible way in a future release of zip/unzip to replace ebcdic.c with a call to iconv? If a future release of zip/unzip supported FROMCODE and TOCODE keywords in the PARM card, as iconv does, users could choose the translation they need.
Errors introduced by changing ebcdic.c would then no longer be possible.
Sample of iconv in a z/OS Batch environment:
//ICONVPRO  EXEC PGM=EDCICONV,REGION=2048K,       
//          PARM=('FROMCODE(IBM-1047),TOCODE(ISO8859-1)')
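
The idea of naming a "from" and a "to" code page, instead of editing a compiled-in table, can be sketched with Python's codecs. This is only an illustration of the concept, not a proposal for Zip's syntax. As far as I know the Python standard library has no IBM-1047 codec, so cp037 (IBM-037) stands in for it here; the function name is made up.

```python
def recode(data: bytes, fromcode: str, tocode: str) -> bytes:
    """Decode with one codec and encode with another, iconv-style."""
    return data.decode(fromcode).encode(tocode)


# "Line1" and "End" in IBM-037, each terminated by EBCDIC NL (0x15).
# cp037 is a stand-in for IBM-1047 in this sketch.
ebcdic_text = "Line1".encode("cp037") + b"\x15" + "End".encode("cp037") + b"\x15"

ascii_text = recode(ebcdic_text, "cp037", "latin-1")
print(ascii_text)

# The translation is reversible by swapping the two code names.
round_trip = recode(ascii_text, "latin-1", "cp037")
print(round_trip == ebcdic_text)
```

With this codec pair the EBCDIC NL comes out as 0x85 rather than 0x0A, so a generic from/to translation would still need a separate decision about line ends.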

regards,
Josef

Al Dunsmuir
July 17, 2010, 9:30pm
Info-ZIP Team
Posts: 94
Josef,

I see that as being an extremely general requirement - requesting iconv translation of data with specific "from" and "to" code pages.  It's just that with ASCII<>EBCDIC translation us mainframe folks need it more frequently.

I think we can come up with a syntax that is a little closer to the UNIX standard, so that Ed and crew will approve.  
Al