I created a bug report about it a while back: <a href="http://jira.qos.ch/browse/LBCORE-168">http://jira.qos.ch/browse/LBCORE-168</a><br>I am not a Windows expert by any stretch of the imagination, but in a Windows cluster you can define a single owner for a shared SAN. This means that if one server goes down, the other automatically takes over ownership of the drive, which should be relatively transparent to other systems. However, there is a 10-ish second delay in the switch, and during those 10 seconds something can go awfully wrong with the file locks in prudent mode, it seems. Basically, the JVM reports the log file as locked, and because lock() is a blocking call, the thread trying to lock the file hangs indefinitely. The current custom appender we have written calls tryLock() repeatedly with a delay between attempts, and after a configurable number of failures it dies gracefully.<br>
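<br>
To illustrate the retry approach described above, here is a minimal sketch. It is not the actual appender code: the class name RetryingFileLock, the acquire method and its maxAttempts and delayMillis parameters are hypothetical names chosen for this example. It only relies on the standard java.nio FileChannel.tryLock() call, which returns immediately instead of blocking.<br>

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

// Hypothetical helper for illustration only, not the appender mentioned above.
final class RetryingFileLock {

    private RetryingFileLock() {}

    // Returns a held lock, or null once maxAttempts tries have all failed.
    static FileLock acquire(FileChannel channel, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        int remaining = maxAttempts;
        while (remaining > 0) {
            try {
                // tryLock() does not block; null means another process holds the lock.
                FileLock lock = channel.tryLock();
                if (lock != null) {
                    return lock;
                }
            } catch (OverlappingFileLockException e) {
                // An overlapping lock is already held inside this JVM;
                // count it as a failed attempt.
            }
            remaining--;
            if (remaining > 0) {
                Thread.sleep(delayMillis); // wait before the next attempt
            }
        }
        return null; // the caller gives up instead of hanging
    }
}
```

A caller would treat a null result as the signal to stop logging to that file (or skip the event) rather than wait forever, which is the difference from the blocking lock() call.<br>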
It is hard to write a test case for this for obvious reasons, but I was able (at least back when the issue was filed) to reproduce the situation repeatedly.<br><br><div class="gmail_quote">On 8 April 2011 08:58, Ceki Gulcu <span dir="ltr">&lt;<a href="mailto:ceki@qos.ch">ceki@qos.ch</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><br>
Hi Alex,<br>
<br>
<br>
Thank you for the heads up. I was not aware of deadlocks occurring in prudent mode. Can you tell us more about your environment, OS version, file system sharing technology, etc.? For example, what do you mean by "ownership of the drive"?<br>
<br>
Cheers,<br>
--<br>
Ceki<br>
<br>
QOS.ch, main sponsor of cal10n, logback, mistletoe and slf4j open source<div class="im"><br>
projects, is looking to hire talented software engineers. For<br>
further details, see <a href="http://logback.qos.ch/job.html" target="_blank">http://logback.qos.ch/job.html</a><br>
<br></div><div class="im">
On 08.04.2011 07:19, Alex Vb wrote:<br>
</div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">
Just a heads up: we had a similar use case at a client, but unfortunately<br>
they were using a Windows cluster where one node owned the shared drive<br>
that the logs had to be written to. The problem was that when you<br>
switched ownership of the drive while the application was logging in<br>
prudent mode, the file lock could not always be released. This means the<br>
next time you tried to log, the blocking lock() call would hang<br>
indefinitely. After this happened a few times (caught it before production,<br>
luckily), we wrote custom appenders with non-blocking locking, and now we<br>
usually log asynchronously, which avoids the problem altogether (no<br>
prudent mode needed).<br>
<br></div><div><div></div><div class="h5">
On 7 April 2011 18:44, Ceki Gulcu &lt;<a href="mailto:ceki@qos.ch" target="_blank">ceki@qos.ch</a>&gt; wrote:<br>
<br>
<br>
As David mentioned, prudent mode caters for this use case. It should<br>
work nicely.<br>
<br>
BTW, I did not get the original message from "LogbackUser",<br>
apparently posted from Nabble.<br>
<br>
--<br>
Ceki<br>
<br>
<br>
<br>
<br>
On 07.04.2011 17:56, David Roussel wrote:<br>
<br>
<br>
Use prudent mode -<br>
<a href="http://logback.qos.ch/manual/appenders.html#FileAppender" target="_blank">http://logback.qos.ch/manual/appenders.html#FileAppender</a><br>
<br>
<br>
<br>
LogbackUser wrote:<br>
<br>
<br>
Are there any configuration properties through which multiple<br>
web applications using logback could be configured to log to the same<br>
log file? This assumes that the messages would be logged concurrently<br>
without any of them being lost.<br>
<br>
The reason I ask is that we have a clustered environment of GlassFish<br>
server instances where the same web application is installed on all the<br>
server instances in the cluster.<br>
<br>
<br>
<br>
_______________________________________________<br>
Logback-user mailing list<br></div></div>
<a href="mailto:Logback-user@qos.ch" target="_blank">Logback-user@qos.ch</a><div class="im"><br>
<a href="http://qos.ch/mailman/listinfo/logback-user" target="_blank">http://qos.ch/mailman/listinfo/logback-user</a><br>
<br>
<br>
<br>
<br>
</div></blockquote>
<br>
<br>
<div class="im"></div><div><div></div><div class="h5">
</div></div></blockquote></div><br>