problems with gawk 3.1.5-3 hanging -- more info

David Carter carter@pondol.com
Fri Mar 31 01:42:00 GMT 2006


Igor Peshansky wrote:
> On Thu, 30 Mar 2006, David Carter wrote:
>> It appears to me that by opening the file as O_TEXT, that gawk is
>> hanging because it is waiting for that LF char to follow the CR (which
>> never comes). Does this sound likely to you?
> 
> If this theory were true, "echo -ne 'aa\rb' | gawk '{print $0}'" would
> hang.  It doesn't for me, even with textmode pipes...

Yes, I realized this myself soon after posting. Your echo command 
doesn't hang for me either. As I said in my original post, this is one 
of those annoying bugs: if I try to make it hang interactively, it 
always works correctly (never hangs), but if I run my regular script, 
it usually (but not always) hangs.  This is another clue that my 
initial "theory" was incorrect: if it were true, the program would hang 
every time.

Here's an example line, callable from a prompt, that usually hangs:

$ rsync -Pv sourcefile rmachine:/rpath/ | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

To test this, I recommend using a source/remote combination for rsync 
that will take about 30 seconds to a minute to complete. This will 
create enough output for gawk to replicate the issue.
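
For anyone without a convenient slow rsync transfer, a rough stand-in 
can be sketched as below. It assumes that the relevant property of the 
"rsync -P" output is a series of progress updates that each end in a 
bare CR; the function name fake_progress and the line format are made 
up for illustration, and there is no guarantee that it triggers the 
same hang.

```shell
# Hypothetical stand-in for "rsync -Pv": emits COUNT progress updates, each
# terminated by a bare CR, then one final LF-terminated line.
# Usage: fake_progress COUNT DELAY_SECONDS
fake_progress() {
    count=${1:-5}
    delay=${2:-0}
    i=1
    while [ "$i" -le "$count" ]; do
        # Byte count and percentage are invented values, not real rsync output.
        printf '%8d %3d%%\r' "$((i * 32768))" "$((i * 100 / count))"
        [ "$delay" = 0 ] || sleep "$delay"
        i=$((i + 1))
    done
    printf 'done\n'
}

# Example, with the same gawk program as above:
# fake_progress 60 1 | gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
```

The delay spreads the output over time so that gawk sees many small 
writes on the pipe instead of one large one.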

If this hangs (it may not hang the first time; give it 2 or 3 runs), 
you'll stop getting output to stdout and it will just sit there. If you 
go to another prompt to do a ps, you'll see that rsync is done running 
but gawk is still sitting there. CTRL+C in the window running the script 
does nothing. You need to kill the gawk process from another bash prompt.

> Try saving the output of rsync to file and running gawk over that
> separately...  

Good idea. Per your advice, I tried doing something like the following:

$ rsync -Pv sourcefile rmachine:/rpath/ > rsync.out
$ cat rsync.out | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

Surprisingly, that code never hangs. Also, this never hangs:

$ rsync -Pv sourcefile rmachine:/rpath/ | xxd | xxd -r | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

However, this usually hangs:

$ rsync -Pv sourcefile rmachine:/rpath/ | cat | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
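
One way to confirm that a captured stream really contains bare CR 
terminators is to count the CR and LF bytes in it. A minimal sketch, 
assuming the capture was saved to a file such as rsync.out as in the 
earlier test; count_byte is a hypothetical helper name:

```shell
# count_byte BYTE FILE: print how many times BYTE occurs in FILE.
# BYTE is passed to tr as a set, so escapes such as '\r' and '\n' work.
count_byte() {
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

# Example (rsync.out is the capture file assumed above):
# count_byte '\r' rsync.out    # number of CR bytes
# count_byte '\n' rsync.out    # number of LF bytes
```

If the CR count is much larger than the LF count, most records in the 
stream are progress updates terminated by CR alone.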

> Also, if gawk really hangs, you can run it under strace to
> see exactly what it was doing up to the hang (but please don't post the
> strace output unless you're asked to do so by Corinna or CGF).

I tried something like the following:

$ rsync -Pv sourcefile rmachine:/rpath/ | strace \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

But, unfortunately, this never hangs. So I tried this:

$ ( sleep 10; rsync -Pv sourcefile rmachine:/rpath/ ) | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

and then went to another window and started strace on the gawk PID. 
This usually hangs. Looking at the strace output, the last thing gawk 
does is:

    87 22612601 [read_pipe] gawk 188 fhandler_base::read: returning 1, text mode

Every time it hangs, I get "read returning 1, text mode". If I look at 
the strace output for the successful (non-hanging) runs, I never get a 
"read returning 1, text mode".
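
To compare several runs without reading each trace by eye, the check 
can be scripted. This is only a sketch: it assumes each trace was saved 
to its own file, and the helper name count_one_byte_reads and the file 
names are hypothetical.

```shell
# count_one_byte_reads FILE: print the number of lines in a saved strace
# log that report a 1-byte read in text mode.
# Assumption: the log contains lines shaped like the one quoted above.
count_one_byte_reads() {
    grep -c 'fhandler_base::read: returning 1, text mode' "$1"
}

# Example (trace file names are placeholders):
# for f in hang1.trace ok1.trace; do
#     printf '%s: %s\n' "$f" "$(count_one_byte_reads "$f")"
# done
```

A count of 0 for every non-hanging run and a nonzero count for every 
hanging run would match what I saw by hand.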

All of this makes me wonder whether:
   a) rsync is doing something with its stdout file descriptor that it 
      shouldn't be doing, or
   b) gawk is doing something with its stdin file descriptor that it 
      shouldn't be doing.

If a), then why doesn't it break when I just redirect the output of 
rsync to a file? If b), then what is it about piping the output of rsync 
to gawk that is different (from gawk's point of view) than when I just 
save the rsync output to a file and then send the contents of the file 
to gawk?

And another thing...why would any of this make any difference if gawk 
opens the file as O_TEXT vs O_BINARY?

> HTH,

It was a great help. Thanks, Igor. Any other light you can shed is much 
appreciated.

Regards;

David Carter



