Unzip 6.0 is missing option -O

Nodame
July 20, 2009, 10:46am
Baby Member
Posts: 8
The -O option, which lets you set the encoding for filenames, is missing from the latest release.

To test, I made a small zip file on Windows XP whose filenames are encoded in Shift-JIS and tried to open it on Linux in a UTF-8 environment. I have attached the .zip to this post.

The following example shows what happens with and without -O.

Code
$ unzip -l Zip_Test.zip 
Archive:  Zip_Test.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  07-20-09 11:21   Zip_Test/
       37  07-20-09 11:21   Zip_Test/РVЛKГeГLГXГg ГhГLГЕГБГУГg.txt
 --------                   -------
       37                   2 files

$ unzip -O shift-jis -l Zip_Test.zip
Archive:  Zip_Test.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  07-20-09 11:21   Zip_Test/
       37  07-20-09 11:21   Zip_Test/新規テキスト ドキュメント.txt
 --------                   -------
       37                   2 files




Attachment: zip_test_3470.zip
373 downloads   -   Size: 0.30 KB

EG
August 16, 2009, 1:53am
Info-ZIP Team
Posts: 331
What version of unzip are you using (unzip -v should list it) and what are you using to create the archive?

The tendency in the zip community now is to store UTF-8 in the archive, so that character set conversion issues (like knowing the names of the source and target character sets) mostly go away.  Currently, Zip 3.0 and later support Unicode encoding of paths, and UnZip 6.0 and later can mostly handle recreating Unicode paths.
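
One way to see whether an archive already stores UTF-8 names is to check general purpose bit 11 (0x800) of each entry. The sketch below uses Python's standard zipfile module rather than UnZip itself; the helper name `names_with_utf8_flag` and the sample entries are made up for illustration:

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: entry name is stored as UTF-8


def names_with_utf8_flag(archive):
    """Return (filename, has_utf8_flag) pairs for every entry in the archive."""
    with zipfile.ZipFile(archive) as zf:
        return [(info.filename, bool(info.flag_bits & UTF8_FLAG))
                for info in zf.infolist()]


# Build a tiny archive in memory: one ASCII name, one non-ASCII name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("plain.txt", "ascii name")
    zf.writestr("テキスト.txt", "non-ASCII name")
buffer.seek(0)

report = names_with_utf8_flag(buffer)
for name, flagged in report:
    print(name, flagged)
```

Entries without the flag carry names in whatever charset the creating tool used, which is the case the rest of this thread is about.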
Nodame
August 20, 2009, 4:47pm
Baby Member
Posts: 8
I ran the above with 5.52, the last version that still had -O. The version where -O is missing is 6.0.

The archive was made with 7z on a Japanese Windows XP with Shift-JIS filesystem encoding, and the commands were run on Linux with UTF-8 encoding.

UTF-8 to UTF-8 works, of course. But non-UTF-8 files are plentiful.
sms
August 20, 2009, 7:29pm
Info-ZIP Team
Posts: 371
> The one I did the above with was the last one where -O was still in
> which is 5.52. The one where -O is missing is 6.0.

   I don't see a "-O" option in the normal UnZip 5.52 code.  Where did
you get your UnZip program?  Is the source available?

   Actual "unzip -v" output might be interesting.
Nodame
August 20, 2009, 10:47pm
Baby Member
Posts: 8
Ah, now I get it. I was using the 5.52 Arch Linux package and was about to post the package build, but now I see that -O comes from a patch. In the first release of 6.0, the maintainer forgot to include the patch; the latest version has it again. I had not bothered with unzip updates and stuck with 5.52. I just updated to try it out, and -O works.
So thanks, I can enjoy your 6.0 release now. And thanks to the patch writers.
sms
August 20, 2009, 11:10pm
Info-ZIP Team
Posts: 371
   Where did you get the patch?
Nodame
August 20, 2009, 11:30pm
Baby Member
Posts: 8
noldor
September 14, 2009, 2:17pm
Baby Member
Posts: 1
Quoted from sms
   Where did you get the patch?

Another, very similar patch is listed here: http://sisyphus.ru/en/srpm/Sisyphus/unzip/patches.

I have only used 5.52, and its patch unzip-5.52-alt-natspec.patch does the job: my archive with Cyrillic file names is extracted and listed correctly without any special "-O" option. With this patch, the encoding is chosen automatically.
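
The thread does not show how the natspec patch picks the encoding. As a rough illustration only, one plausible approach is to derive the legacy charset from the user's locale. Everything in this sketch is hypothetical, including the table contents and the function names:

```python
# Hypothetical table: locale language -> DOS/OEM code page commonly used
# for zip filenames by Windows tools in that locale. Illustrative only.
OEM_CHARSET_BY_LANGUAGE = {
    "ru": "cp866",
    "ja": "cp932",
    "en": "cp437",
}


def guess_oem_charset(locale_name, default="cp437"):
    """Pick a legacy charset from a locale name such as 'ru_RU.UTF-8'."""
    language = locale_name.split("_")[0].split(".")[0].lower()
    return OEM_CHARSET_BY_LANGUAGE.get(language, default)


def decode_name(raw, locale_name):
    """Decode raw archive bytes with the charset guessed from the locale."""
    return raw.decode(guess_oem_charset(locale_name), errors="replace")


# A Cyrillic name stored as CP866 decodes correctly under a Russian locale.
stored = "привет.txt".encode("cp866")
print(decode_name(stored, "ru_RU.UTF-8"))
```

A guess like this works only when the archive was created in the same language environment as the one reading it, which is why an explicit option such as -O is still useful.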
EG
September 18, 2009, 6:34am
Info-ZIP Team
Posts: 331
Briefly going through the (Sisyphus) patch, it appears to be a version of a similar iconv patch for UnZip that was proposed a while back and that the UnZip maintainer rejected.  Myself, I don't have any problem with including this feature, but the current industry trend is to move to UTF-8 paths throughout.  We've been a bit more sluggish, trying to maintain backward compatibility with existing archives as we move to that.

That said, it's hard to say if we should include this patch in the main release.  Given the UnZip maintainer probably would not accept the patch anyway (since he rejected it before), it might be an uphill battle.

It might be worth putting a pointer on the web site to the patch though.  If you all had to pick the primary place to download the patch, which would it be?  Also, are there instructions anywhere?
darehanl
September 24, 2009, 3:20am
Baby Member
Posts: 2
Any updates on this? Being unable to unzip Windows archives is really critical for my daily use; I went as far as to port (poorly) the old altlinux patch to 6.0 here:
http://bugs.archlinux.org/task/15256

but I'd much rather have someone who actually knows the code implement this functionality. If the "-O charset" patch isn't accepted, does Info-ZIP have a "recommended" method of working around these non-Unicode zip files? Is there something like a converter from legacy zip files to the new UTF-8 zip files? An endless sequence of:
  inflating: ?-??+- ??+??? -?10+?s_.txt
is extremely annoying.

EG// You can apply the patch like this (the altlinux Sisyphus patch requires libnatspec):
$ cd unzip60
$ patch -Np1 -i unzip60-alt-iconv-utf8.patch
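
On the question of a converter from legacy zip files to UTF-8 zip files: the thread names no such tool, but the idea can be sketched with Python's standard zipfile module. This is a minimal sketch, not an Info-ZIP feature. The function names are made up, the source charset must be supplied by the user, and the demo fakes a legacy archive by patching Shift-JIS bytes over a placeholder name:

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: entry name is stored as UTF-8


def decode_legacy_name(info, charset):
    """Return the entry name, re-decoding it when it is not flagged as UTF-8."""
    if info.flag_bits & UTF8_FLAG:
        return info.filename
    # zipfile decoded the raw bytes as cp437; undo that, then decode properly.
    return info.filename.encode("cp437").decode(charset)


def convert_to_utf8_zip(src, dst, charset):
    """Copy every entry from src to dst; non-ASCII names are written as UTF-8."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            new_info = zipfile.ZipInfo(decode_legacy_name(info, charset),
                                       date_time=info.date_time)
            new_info.compress_type = zipfile.ZIP_DEFLATED
            new_info.external_attr = info.external_attr
            zout.writestr(new_info, zin.read(info))


# Demo: fake a legacy archive by overwriting a placeholder name with
# Shift-JIS bytes of the same length (names are not covered by the CRC).
raw_name = "新規テキスト.txt".encode("shift_jis")
placeholder = b"X" * len(raw_name)
legacy = io.BytesIO()
with zipfile.ZipFile(legacy, "w") as zf:
    zf.writestr(placeholder.decode("ascii"), "hello")
legacy = io.BytesIO(legacy.getvalue().replace(placeholder, raw_name))

converted = io.BytesIO()
convert_to_utf8_zip(legacy, converted, "shift_jis")
with zipfile.ZipFile(converted) as zf:
    print(zf.namelist())
```

The sketch recompresses every entry and keeps only the timestamp and external attributes, so extra fields and comments from the original archive are dropped.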
EG
September 26, 2009, 4:23am
Info-ZIP Team
Posts: 331
Quoted from darehanl
Any updates on this? Being unable to unzip Windows archives is really critical for my daily use; I went as far as to port (poorly) the old altlinux patch to 6.0 here:
http://bugs.archlinux.org/task/15256

but I'd much rather have someone who actually knows the code implement this functionality. If the "-O charset" patch isn't accepted, does Info-ZIP have a "recommended" method of working around these non-Unicode zip files? Is there something like a converter from legacy zip files to the new UTF-8 zip files? An endless sequence of:
  inflating: ?-??+- ??+??? -?10+?s_.txt
is extremely annoying.

Agreed.

Quoted from darehanl
EG// You can apply the patch like this (the altlinux Sisyphus patch requires libnatspec):
$ cd unzip60
$ patch -Np1 -i unzip60-alt-iconv-utf8.patch

Thanks.  However, it's the UnZip maintainer's decision, and so far he hasn't accepted adding this capability.  I could try again, though.

Another possibility is to add the patch to our site, after looking it over and doing some testing.  That assumes there are no issues with us distributing the patch and any required files.

I haven't looked at the license issues on either patch.  To use the code it would have to be distributable under the Info-ZIP license.  What are the license restrictions on your patch (which I assume inherits the restrictions of the patch you modified)?
darehanl
November 23, 2009, 8:02pm
Baby Member
Posts: 2
Sorry, EG, it took a while. I've received a response from the AltLinux maintainer, and he says that the license of the patch is identical to the original unzip license. Have you checked with the UnZip maintainer?
Thanks.