openLuup Hangs
-
This is a rare hang/crash of openLuup, but I am now able to reproduce it pretty consistently:
I get close to the house and open Homewave, which starts a poll to openLuup.
Before the poll completes, my phone drops off the VPN for one reason or another, and that causes openLuup to hang.
Most of the time, it is because the phone switches to wifi as I arrive home. Sometimes, it is because the signal is weak and the VPN tunnel gets dropped.
To work around this, I have to wait until I am on wifi before opening Homewave when I get home.
@akbooer, any way to fix this problem?
-
We’ve talked about this elsewhere before... usually related to a network problem or bridged Vera reload.
I’m pretty sure it’s somewhere in the LuaSocket library, which, TBH, doesn’t handle timeouts very well. I tried to code around this in several places, but must be missing something. Very hard to isolate the issue.
-
Indeed, I just wasn't quite sure before but I ran a couple of tests to reproduce today and it is pretty consistent. I am also confident that it is the only way openLuup has crashed for me...
It is when a json is being sent to a client asking for it but the file transfer never gets completed as the client gets disconnected. If I try to kill openLuup on linux under this situation it actually goes through a 90s timeout before killing the process.
Knowing what I know of openLuup, yeah I would think it is in the luasocket but I was wondering if there was a workaround possible.
-
@rafale77 said in openLuup Hangs:
under this situation it actually goes through a 90s timeout
Now that's interesting. There is a default 90 seconds timeout on the HTTP server. It has to be greater than 60 seconds, because that is the interval in which a status request will respond in the event of no other state changes.
-
From the LuaSocket docs at http://w3.impa.br/~diego/software/luasocket/tcp.html
master:settimeout(value [, mode])
client:settimeout(value [, mode])
server:settimeout(value [, mode])
Changes the timeout values for the object. By default, all I/O operations are blocking. That is, any call to the methods send, receive, and accept will block indefinitely, until the operation completes. The settimeout method defines a limit on the amount of time the I/O methods can block. When a timeout is set and the specified amount of time has elapsed, the affected methods give up and fail with an error code.
The amount of time to wait is specified as the value parameter, in seconds. There are two timeout modes and both can be used together for fine tuning:
'b': block timeout. Specifies the upper limit on the amount of time LuaSocket can be blocked by the operating system while waiting for completion of any single I/O operation. This is the default mode;
't': total timeout. Specifies the upper limit on the amount of time LuaSocket can block a Lua script before returning from a call.
The problem, if this is the solution to the problem, is that the HTTP module wrapper makes this level of control unreachable.
I had briefly looked at alternative HTTP / TCP implementation libraries, but honestly, Luasocket is so deeply embedded in Vera that this is hardly feasible.
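For reference, the two timeout modes quoted above can be exercised directly on a raw TCP socket. This is only a minimal sketch on a loopback connection, assuming LuaSocket is installed; the timeout values are arbitrary and nothing here is openLuup code.

```lua
local socket = require "socket"

-- Loopback pair so the sketch needs no external host
-- (port 0 asks the OS for any free port).
local server = assert(socket.bind("127.0.0.1", 0))
local ip, port = server:getsockname()
local client = assert(socket.connect(ip, port))
local peer = assert(server:accept())

-- Without these calls, receive() below would block indefinitely,
-- because the client never sends anything.
peer:settimeout(0.2, "b")  -- at most 0.2 s per blocking OS-level wait
peer:settimeout(1.0, "t")  -- at most 1 s in total before the call returns

local data, err = peer:receive("*l")
print(data, err)  -- nil   timeout

client:close(); peer:close(); server:close()
```

The call gives up with the error string "timeout" instead of hanging, which is the behaviour one would want whenever a client vanishes in the middle of a transfer.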
-
I came to the same potential solution but was wondering if these timeouts could be set somewhere in the openLuup code. Not knowing which function is actually sending the user_data in response to the Homewave call, I did not know where to attempt inserting these timeouts... in the server.lua file maybe?
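To illustrate the idea only: wherever a response is written to an accepted client socket, the send could be capped with a total timeout. The helper name `send_with_deadline` and the 5 second limit are hypothetical, and this is not how openLuup's server code is actually structured; it is a sketch assuming direct access to the LuaSocket TCP object.

```lua
local socket = require "socket"

-- Hypothetical helper (not openLuup code): send `data` on a connected TCP
-- socket, giving up once `seconds` have elapsed in total.
-- Returns true on success, or nil, an error message, and the index of the
-- last byte sent.
local function send_with_deadline(sock, data, seconds)
  sock:settimeout(seconds, "t")  -- note: also applies to later calls on sock
  local last, err, partial = sock:send(data)
  if last then return true end
  return nil, err, partial
end

-- Loopback demonstration (port 0 asks the OS for any free port).
local server = assert(socket.bind("127.0.0.1", 0))
local ip, port = server:getsockname()
local client = assert(socket.connect(ip, port))
local peer = assert(server:accept())

local ok, err = send_with_deadline(peer, '{"status":"ok"}\n', 5)
local line = client:receive("*l")
print(ok, line)

client:close(); peer:close(); server:close()
```

If the receiving side stops reading and the send buffers fill up, the expectation is that `send` returns nil with "timeout" and the number of bytes already sent, so the caller can drop the connection rather than stall the whole process.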
-
Sorry for reviving this @akbooer but I wanted to give a quick update on lua socket.
I realized that the library used by luarocks is very old (3.0 RC1 from 2013) and that the apt debian/ubuntu version is a little newer (March 2015). The library has since gone through 5 years of development and I just installed the latest github version using this command:
luarocks install luasocket --server https://luarocks.org/dev
It compiles the library from the current GitHub master. I am not sure it will help but so far it has not hurt. Will test further...
-
No need to apologise... this issue needs to be fixed!
-
Well first problem after updating the lua socket:
I had some Lua code sending a URL with spaces, which I previously did not have to URL-encode. Now I have to. Luckily, it is pretty straightforward to do with url.escape(content).
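A minimal sketch of that change, using the `socket.url` module that ships with LuaSocket. The host, path, and message text are made up for illustration. Only the value is escaped, because `url.escape` encodes every character other than letters, digits, and underscore, so applying it to the whole URL would also mangle the `:` and `/` characters.

```lua
local url = require "socket.url"

-- Hypothetical endpoint and message, for illustration only.
local base = "http://192.168.1.10/notify?text="
local content = "garage door open"

-- Escape only the parameter value, not the whole URL.
local request = base .. url.escape(content)
print(request)  -- http://192.168.1.10/notify?text=garage%20door%20open
```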
On the other hand... somehow I feel that the response is a lot faster... I may be dreaming and will need to test this more but homewave seems to have no lag on polling. I don't remember seeing this before.
-
Well with the communication being seemingly faster, I am now having a hard time reproducing the problem. I will post back if I am able to but so far, no hanging occurring. Even ALTUI seems to be updating its data from openLuup faster.
-
Update: I can't seem to reproduce the problem anymore, in spite of trying hard, even in ways that would frequently trigger it before. So I think the luasocket update is potentially a fix for these openLuup lockups (really LuaSocket lockups, to be accurate).
Edit: Nope... I really can't reproduce the lock up anymore. I tried at least 20x the same scenario which would cause a lockup every 3rd time: Opening Homewave on my iPhone and have it poll openLuup while the phone is in the middle of switching between VPN through LTE and wifi. Granted I don't know whether it is because the poll completes much faster or if it is that lua no longer locks up. This may really be fixed. Can anybody else who has run into this problem confirm?
-
@akbooer, I am now fairly positive that the updated luasocket fixes the lockups. I have been running for two days with full intent to lock it up, trying all kinds of interruptions, and I haven't been able to get it to hang. Maybe it could be part of the installation recommendations?
-
Excellent!
Have you checked that async HTTP is, in particular, OK?
I ask, because it delves deep into the LuaSocket library, so it’s not unlikely that something changed.
Does the update instruction you gave above work on a RPi? I need to give this a go!
I’ll be pleased to confirm that it’s not due to any openLuup-specific code!
-
Yes, it works very well with the async HTTP calls too, and a lot of my code makes use of them.
The update method works: it asks luarocks to pull from the dev repo and should build and install version "scm2". It should be platform agnostic (the command actually rebuilds the library).
The only thing I had to change was the URL encoding of spaces: in my code that uses the http_async module, I had to encode the spaces in the URL input as %20.
-
Hi AK, please let me know if you have a version that works with the new lua socket. I have another websocket problem, and it would be nice to see if this helps.
Cheers Rene
-
Did the update; openLuup still works. It did not help with my websocket issue, though. When openLuup starts it connects just fine, but if the connection is lost and I close and reconnect, no data is received. Need to dig some more.
-
I'm using the websocket module rigpapa created (with some fixes) https://github.com/reneboer/LuWS/blob/master/luws.lua to talk to the new Ezlo ws API. This is part of the EzloBridge I am making, based on your VeraBridge, as we spoke about. Just put that on GitHub https://github.com/reneboer/EzloBridge.
It is working except when the websocket gets interrupted. It also needs some better error handling etc.
-
Maybe a problem with the web socket itself? What is the source of the interruption?
As I said above, I have been desperately trying to reproduce my old problem by interrupting either the io connection or an HTTP request response, which used to make openLuup (or rather luasocket) hang, and I am no longer able to do so with the updated luasocket version.