[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [msmtp-users] Bug in msmtpq might cause mail loss



WARNING: This e-mail has been altered by MIMEDefang.  Following this
paragraph are indications of the actual changes made.  For more
information about your site's MIMEDefang policy, contact
Postmaster (MIMEDefang) <mimedefang@...276...>.  For more information about MIMEDefang, see:

            http://www.roaringpenguin.com/mimedefang/enduser.php3

------------------------------------------------------------------------------
Der Mailanhang 'testscript2.sh' wurde wurde aus Sicherheitsgruenden
umbenannt in 'defang-1.binary'. Um den alten Namen wiederherzustellen, 
rechtsklicken Sie bitte auf den Anhang und Speichern diesen
unter: testscript2.sh'
------------------------------------------------------------------------------
An attachment named 'testscript2.sh' was converted to 'defang-1.binary' for 
safety reasons. To recover the file, right-click on the attachment
and Save As 'testscript2.sh'
------------------------------------------------------------------------------


------------------------------------------------------------------------------
Der Mailanhang 'testscript3.sh' wurde wurde aus Sicherheitsgruenden
umbenannt in 'defang-2.binary'. Um den alten Namen wiederherzustellen, 
rechtsklicken Sie bitte auf den Anhang und Speichern diesen
unter: testscript3.sh'
------------------------------------------------------------------------------
An attachment named 'testscript3.sh' was converted to 'defang-2.binary' for 
safety reasons. To recover the file, right-click on the attachment
and Save As 'testscript3.sh'
------------------------------------------------------------------------------

Hi Martin,

thanks for considering the first patch.

I want to add that there is also an ugly race condition in the queue script 
coming from the way the queue is locked. I run into this error reproducibly 
when bouncing several messages simultaneously in mutt, so it is not 
esoteric. It results in errors such as

2011 26 Oct 21:07:27 : preparing to send .mail file [ /home/ph/.msmtpqueue/2011-10-26-21.07.27.mail ] but
2011 26 Oct 21:07:27 :   corresponding .msmtp file [ /home/ph/.msmtpqueue/2011-10-26-21.07.27.msmtp ] was not found in queue
2011 26 Oct 21:07:27 :   skipping this mail ; this is worth looking into
 
I've attached two scripts which illustrate the problem, modeled on [1]. Go to an 
*EMPTY* directory, copy the scripts there and run
$ echo 0 > testcounterfile; for (( i=1; i <= 10; i++ )); do ./testscript2.sh&; done
This launches 10 instances of testscript2.sh each of which has the duty to 
increment the number in testcounterfile by 1. The fact that testcounterfile 
reproducibly contains numbers strictly smaller than 10 after all instances 
of testscript2.sh have terminated shows that the locking doesn't work 
reliably.

The issue is that another instance of the script can squeeze between
    [ -e "$LOK" ]
and
    touch "$LOK"

One solution is to use mkdir and check the exit status because mkdir is 
atomic. The testscript3.sh illustrates this and with this I reliably get 
the value 10 in testcounterfile when executing
$ echo 0 > testcounterfile; for (( i=1; i <= 10; i++ )); do ./testscript3.sh&; done

I've attached a patch which implements the behavior of testscript3.sh 
msmtpq. Any thoughts?

Regards,
Philipp

[1] http://stackoverflow.com/questions/325628/race-condition-in-the-common-lock-on-file/1571711#1571711

Attachment: defang-1.binary
Description: defang-1.binary

Attachment: defang-2.binary
Description: defang-2.binary

--- msmtpq	2011-10-19 15:03:01.523338414 +0200
+++ msmtpq-my	2011-10-27 18:53:53.826788271 +0200
@@ -133,22 +133,20 @@
   local -i MAX=240 SEC=0             # max seconds to gain a lock ; seconds waiting
 
   if [ -z "$1" ] ; then              # lock queue
-    while [ -e "$LOK" ] && (( SEC < MAX )) ; do      # lock file present
+    mkdir $LOK 2>/dev/null &&  LKD='t'
+    while [ -z "$LKD" ] && (( SEC < MAX )) ; do      # lock file present
 	    sleep 1                                        # wait a second
 	    (( ++SEC ))                                    # accumulate seconds
+	    mkdir $LOK 2>/dev/null &&  LKD='t'
     done                                             # try again while locked for MAX secs
-    [ -e "$LOK" ] && \
+    [ -z "$LKD" ] && \
 	    err '' "cannot use queue $Q : waited $MAX seconds for"\
 	           "  lockfile [ $LOK ] to vanish ; giving up"\
 	           'if you are certain that no other instance of this script'\
 	           '  is running, then delete the lock file manually' '' # lock file still there, give up
 
-    touch "$LOK" && \
-      LKD='t' || \
-      log -e "$?" "couldn't create queue lock file [ $LOK ]"       # lock queue
-
   elif [ "$1" = '-u' ] ; then        # unlock queue
-    'rm' -f "$LOK"                   # remove the lock
+    'rmdir' "$LOK"                   # remove the lock
   fi
 }