Wednesday, March 25, 2015

How exactly does perl's -0 option work?


According to man perlrun:



-0[octal/hexadecimal]
specifies the input record separator ($/) as an octal or
hexadecimal number. If there are no digits, the null character is
the separator.


and



The special value 00 will cause Perl to slurp files in paragraph
mode. Any value 0400 or above will cause Perl to slurp files
whole, but by convention the value 0777 is the one normally used
for this purpose.
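
The three documented modes can also be written as explicit assignments to `$/`, which may make the switches easier to compare. This is a minimal sketch, assuming `perl` is on the PATH; the file name `sample.txt` is a placeholder introduced here, and the `$/` values are the ones described in perlvar.

```shell
# Build a two-paragraph sample file (sample.txt is a hypothetical name).
printf 'This is paragraph one.\n\nThis is paragraph two.\n' > sample.txt

# -0    corresponds to $/ = "\0" (null character as the separator)
null_sep=$(perl -ne 'BEGIN { $/ = "\0" } print; exit' sample.txt)

# -00   corresponds to $/ = ""   (paragraph mode)
paragraph=$(perl -ne 'BEGIN { $/ = "" } print; exit' sample.txt)

# -0777 corresponds to undef $/  (slurp the whole file)
slurp=$(perl -ne 'BEGIN { undef $/ } print; exit' sample.txt)

# Show the first record read in each mode, separated by '---' lines.
printf '%s\n---\n%s\n---\n%s\n' "$null_sep" "$paragraph" "$slurp"
```

Because the sample file contains no null characters, the first and third modes return the same first record, so a file like this cannot tell them apart.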


However, given this input file:



This is paragraph one.

This is paragraph two.


I get some unexpected results:



$ perl -0ne 'print; exit' file ## \0 is used, so everything is printed
This is paragraph one.

This is paragraph two.

$ perl -00ne 'print; exit' file ## Paragraph mode, as expected
This is paragraph one.


So far, so good. Now, why do these two also seem to work in paragraph mode?



$ perl -000ne 'print; exit' file
This is paragraph one.

$ perl -0000ne 'print; exit' file
This is paragraph one.


And why is this one apparently slurping the entire file again?



$ perl -00000ne 'print; exit' file
This is paragraph one.

This is paragraph two.


Further testing shows that these all seem to work in paragraph mode:



perl -000
perl -0000
perl -000000
perl -0000000
perl -00000000


While these seem to slurp the file whole:



perl -00000
perl -000000000


I guess my problem is that I don't understand octal well enough (at all, really); I am a biologist, not a programmer. Do the latter two slurp the file whole because both 0000 and 00000000 are >= 0400? Or is there something completely different going on?
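
For reference, the octal values involved can be converted to decimal with the shell's own arithmetic. This is only a sketch of the number conversion, assuming a POSIX-style shell in which a leading 0 marks an octal constant; the variable names are made up for illustration, and it says nothing about how perl itself parses the switch.

```shell
# Convert the octal values from the man page excerpt to decimal.
# In POSIX shell arithmetic, a leading 0 marks an octal constant.
slurp_threshold=$((0400))   # smallest value documented to slurp whole files
slurp_convention=$((0777))  # value conventionally used for slurping
largest_byte=$((0377))      # largest value that fits in one 8-bit character
all_zeros=$((00000))        # extra leading zeros do not change the value

echo "0400  -> $slurp_threshold"
echo "0777  -> $slurp_convention"
echo "0377  -> $largest_byte"
echo "00000 -> $all_zeros"
```

Read purely as numbers, 0400 is the first value too large for a single byte, while any run of zeros is still 0. So a string of zeros is not >= 0400 in the numeric sense, and the threshold alone does not seem to account for the slurping.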


