Hylafax Mailing List Archives
Re: [hylafax-users] faxgetty <defunct> - 'Extra' FaxGetty
Hans Strickler wrote:
> 4588 ? 00:00:00 faxgetty <defunct>
Several of the HylaFAX processes fork so that some action, such as flow
control or image decoding, can run in a child process as needed without
interrupting the parent, which is then free to do other things.
"<defunct>" processes are also known as "zombies", and exist because the
parent process is still running, while the child process has exited, but
the parent process has not yet acknowledged the child's exit and cleaned
it up. Normally this situation shouldn't give you any errors. If you do
some googling on it you'll find that a lot of software over time has
left zombie processes. Now HylaFAX intends to clean up after itself, and
it does. But in your case the existence of the zombie process may help
us figure out what's wrong.
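
As a minimal sketch of the zombie lifecycle described above (this is not
HylaFAX code, and the function name forkAndReap is made up for
illustration): the child exits at once, and it stays a zombie, which ps
shows as <defunct>, until the parent calls waitpid() on it.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper, not part of HylaFAX.
// Forks a child that exits immediately with `code`, then reaps it.
// Returns the child's exit status, or -1 on any failure.
int forkAndReap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              // fork failed, nothing to reap
    if (pid == 0)
        _exit(code);            // child: exit right away

    // Parent: from the child's exit until this waitpid() returns,
    // the child is a zombie. waitpid() collects its status and
    // removes it from the process table.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the parent never made that waitpid() call and kept running, the
child's entry would stay in the process table as <defunct>.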
> Often we see “faxgetty <defunct>” when we arrive in the morning.
> ALWAYS from the same sender. This after we ‘Grep’ “Fax” I notice that
> there is a 3rd faxgetty (which is Defunct).
> We have ruled out this being line noise by switching the phone lines
> between the 2 modems (ttyS14 USED to do it) in the Box. This is a
> send/receive error.
> Sometimes we get errors, sometimes it will not give errors. (Sender
> broadcasting multiple faxes overnight). Sometimes it will happen on
> the 1st or 2nd fax, other times after the 10th. Simply must be a
> send-receive issue. Sender SAYS nothing has changed on their end w/
> their fax-server and we have not changed anything on ours, yet, this
> happened kind of ‘out of the blue’.
> Solution so far has been to kill the faxgetty instances and re-start
> them. What could be causing these?
From the rest of your e-mail you seem to be saying that faxgetty will
stop answering calls in these situations. The log you provide shows a
clear completion of the fax protocol, but I do not see the usual
"RECV FAX" log entries at the bottom (including the one detailing the
faxrcvd command), so that gives us further clues. In fact, it points the
finger at the last two forks in faxd/FaxRecv.c++.
In FaxRecv.c++ we have three low-priority logging and
document-processing events. They are run in the background in child
processes because they're not a priority, but they must occur in this
order: notifyRecvBegun, notifyPageRecvd, and notifyDocumentRecvd.
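
A rough sketch of how a parent can keep background children in order
while also reaping them, assuming nothing about the real faxd sources:
notifyStub, runOrdered and the one-letter tags are hypothetical stand-ins
for the notify* calls. The parent waits for the previous child before it
forks the next one, so the events cannot overlap and no exited child is
left unreaped.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>

// Hypothetical stand-in for a notify* call: records one tag byte on fd.
static bool notifyStub(int fd, char tag)
{
    return write(fd, &tag, 1) == 1;
}

// Hypothetical helper, not HylaFAX code. Runs one child per tag and
// returns the order in which the tags were actually recorded.
std::string runOrdered(const std::string& tags)
{
    int fds[2];
    if (pipe(fds) != 0)
        return "";

    pid_t prev = -1;
    for (char tag : tags) {
        // Parent reaps the previous child before starting the next:
        // this keeps the events ordered and leaves no zombie behind.
        if (prev > 0)
            (void) waitpid(prev, nullptr, 0);
        prev = fork();
        if (prev == 0) {
            close(fds[0]);
            _exit(notifyStub(fds[1], tag) ? 0 : 1);
        }
    }
    if (prev > 0)
        (void) waitpid(prev, nullptr, 0);   // reap the last child too

    // All writers are gone once the parent closes its write end,
    // so the read loop below ends at end-of-file.
    close(fds[1]);
    std::string seen;
    char c;
    while (read(fds[0], &c, 1) == 1)
        seen += c;
    close(fds[0]);
    return seen;
}
```

Calling runOrdered("BPD"), with B, P and D standing for "begun", "page"
and "document", should hand back the tags in the order given. The cost of
this design is that the parent may block briefly if the previous child is
still busy.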
In looking at that section of code again, it appears to be doing
something a bit shady, and that may be where the problem lies. Please
try the attached patch to see if it resolves things for you. If you need
a tarball or a RedHat/Fedora RPM, then let me know.
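
This message does not spell out what is "shady". One reading of the patch
at the end of this message, offered here only as an assumption, is that
the old code had the new child call waitpid() on the previous child,
which is its sibling rather than its own child. On POSIX systems such a
call fails with ECHILD, so it neither orders the events nor reaps the
earlier child. The sketch below (siblingWaitFails is a made-up name, not
HylaFAX code) checks that behaviour in isolation.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper, not HylaFAX code.
// Returns true if a second child's waitpid() on its sibling fails
// with ECHILD.
bool siblingWaitFails()
{
    pid_t first = fork();
    if (first < 0)
        return false;
    if (first == 0)
        _exit(0);                       // first child: exits at once

    pid_t second = fork();
    if (second < 0) {
        (void) waitpid(first, nullptr, 0);
        return false;
    }
    if (second == 0) {
        // Second child: `first` is a sibling here, not a child of
        // this process, so waiting on it is expected to fail.
        int rc = waitpid(first, nullptr, 0);
        _exit((rc == -1 && errno == ECHILD) ? 0 : 1);
    }

    // Parent: collect the second child's verdict, then reap the
    // first child, which only its real parent is able to do.
    int status = 0;
    bool failedAsExpected = waitpid(second, &status, 0) == second
        && WIFEXITED(status) && WEXITSTATUS(status) == 0;
    (void) waitpid(first, nullptr, 0);
    return failedAsExpected;
}
```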
I'm not exactly sure why this would only happen to you with one
particular sender, though. Maybe it's just bad luck with that sender.
Maybe that sender is doing things in a particular way that makes the
problem more likely to occur. Anyway, please try the patch.
Thanks,
Lee.
--- hylafax.orig/faxd/FaxRecv.c++ 2006-01-16 07:13:56.000000000 -0800
+++ hylafax/faxd/FaxRecv.c++ 2006-01-26 09:27:29.622047904 -0800
@@ -69,7 +69,7 @@
* If the system is busy then notifyRecvBegun may not return
* quickly. Thus we run it in a child process and move on.
*/
- waitNotifyPid = fork();
+ waitNotifyPid = fork(); // waitNotifyPid keeps the notifies ordered
switch (waitNotifyPid) {
case 0:
// NB: partially fill in info for notification call
@@ -219,11 +219,10 @@
* If syslog is busy then notifyDocumentRecvd may not return
* quickly. Thus we run it in a child process and move on.
*/
- pid_t pid = waitNotifyPid;
+ if (waitNotifyPid > 0) (void) Sys::waitpid(waitNotifyPid); // keep the notifies ordered
waitNotifyPid = fork();
switch (waitNotifyPid) {
case 0:
- if (pid > 0) (void) Sys::waitpid(pid);
notifyDocumentRecvd(info);
sleep(1); // XXX give parent time
exit(0);
@@ -284,11 +283,10 @@
* Thus we run it in a child process and move on. Timestamps
* in syslog cannot be expected to have exact precision anyway.
*/
- pid_t pid = waitNotifyPid;
+ if (waitNotifyPid > 0) (void) Sys::waitpid(waitNotifyPid); // keep the notifies ordered
waitNotifyPid = fork();
switch (waitNotifyPid) {
case 0:
- if (pid > 0) (void) Sys::waitpid(pid);
notifyPageRecvd(tif, info, ppm);
sleep(1); // XXX give parent time
exit(0);