I have the following yaml configuration in local_mqtt_devices file
x_mqtt_device:
  set_speed:
    arguments:
      speed:
        type: str
    topic: "command/%friendly_name%"
    payload:
      type: json
      expr: '{ "fan": parameters.speed }'

While this works fine, I'm wondering how this could be changed to "fixed" parameters, as in this case "fan" only accepts "A", "Q" or a numeric value of 1-5?
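For what it's worth, the closest workaround I can think of today is validating inside the payload expression, so that anything outside the allowed set is replaced. This is only a sketch of that idea (the fallback to "A" is an arbitrary choice of mine, and the case/when syntax is just the normal Reactor expression language), not something taken from the MQTTController documentation:

x_mqtt_device:
  set_speed:
    arguments:
      speed:
        type: str
    topic: "command/%friendly_name%"
    payload:
      type: json
      # Pass the value through only if it is one of the allowed ones;
      # otherwise fall back to "A" (my arbitrary default).
      expr: >
        {
          "fan": case
            when parameters.speed == "A" || parameters.speed == "Q": parameters.speed
            when parameters.speed == "1" || parameters.speed == "2" || parameters.speed == "3" || parameters.speed == "4" || parameters.speed == "5": parameters.speed
            else "A"
          end
        }

This keeps the payload valid even if a reaction passes a bad value, but it doesn't surface an error to the caller, which is why I'd still prefer a way to restrict the argument itself.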
Hi!
I get this message when I'm on the status tab:
System Configuration Check
The time on this system and on the Reactor host are significantly different. This may be due to incorrect system configuration on either or both. Please check the configuration of both systems. The host reports 2025-04-01T15:29:29.252Z; browser reports 2025-04-01T15:29:40.528Z; difference 11.276 seconds.
I have MSR installed as a docker on my Home Assistant Blue / Hardkernel ODROID-N2/N2+. MSR version is latest-25082-3c348de6.
HA versions are:
Core 2025.3.4
Supervisor 2025.03.4
Operating System 15.1
I have restarted HA as well as MSR multiple times. This message didn't show two weeks ago. I don't know if it has anything to do with the latest MSR version.
Does anyone know what I can try?
Thanks in advance!
Let's Be Careful Out There (Hill Street reference...) 🙂
/Fanan
I have a very strange situation: if InfluxDB restarts, other containers that restart at the same time may fail (under circumstances that aren't easy to pin down), InfluxDB remains unreachable, and these containers crash. I then need to restart these containers in an exact order, after restarting InfluxDB.
While I understand what's going on, I need a way to reliably determine that InfluxDB and these containers are not reachable, in order to identify this situation and manually check what's going on - and, maybe, in the future, automatically restart them if needed.
So, I was looking at the HTTP Request action, but I need to capture the HTTP response code instead of the response body (because if the ping is OK, InfluxDB will reply with a 204), and, potentially, a way to programmatically detect that it's failing to get a response at all.
While I could write a custom HTTP controller for this or a custom HTTP virtual device, I was wondering if this is somewhat on your roadmap @toggledbits
Thanks!
Hi,
I'm on
-Reactor (Multi-hub) latest-25067-62e21a2d
-Docker on Synology NAS
-ZWaveJSUI 9.31.0.6c80945
Problem with ZwaveJSUI:
When I try to change the color of an RGBWW bulb, it doesn't change to the RGB color and the bulb remains warm or cold white.
I tried with the Zipato RGBW Bulb V2 RGBWE2, the Hank Bulb HKZW-RGB01, and the Aeotec Bulb 6 (ZWA002), so it seems to happen with all RGBWW bulbs with Reactor/ZWaveJSUI.
From Reactor I'm using the entity actions "rgb_color.set" and "rgb_color.set_rgb".
After I send the Reactor command, the RGB settings change in ZWaveJSUI, but the white channel is not set to "0", so the prevalent channel remains warm/cold white and the bulb doesn't change to the RGB color.
This is the status of the bulb in ZWaveJSUI after "rgb_color.set" (235,33,33), and the bulb is still warmWhite:
x_zwave_values.Color_Switch_currentColor={"warmWhite":204,"coldWhite":0,"red":235,"green":33,"blue":33}

The "cold white" and "warm white" settings interfere with the RGB color settings.
Reactor can change bulb colors with rgb_color set — (value, ui8, 0x000000 to 0xffffff) or rgb_color set_rgb — (red, green, blue, all ui1, 0 to 255), but if warm or cold white
are not set to "0", ZWaveJSUI doesn't change them, and I can't find a way to change to RGB or from RGB back to warm white.
So if I use rgb_color set_rgb — (235,33,33) from Reactor, in ZWaveJSUI I have:
x_zwave_values.Color_Switch_targetColor={"red":235,"green":33,"blue":33} 14/03/2025, 16:43:57 - value updated Arg 0: └─commandClassName: Color Switch └─commandClass: 51 └─property: targetColor └─endpoint: 0 └─newValue └──red: 235 └──green: 33 └──blue: 33 └─prevValue └──red: 235 └──green: 33 └──blue: 33 └─propertyName: targetColor 14/03/2025, 16:43:57 - value updated Arg 0: └─commandClassName: Color Switch └─commandClass: 51 └─property: currentColor └─endpoint: 0 └─newValue └──warmWhite: 204 └──coldWhite: 0 └──red: 235 └──green: 33 └──blue: 33 └─prevValue └──warmWhite: 204 └──coldWhite: 0 └──red: 235 └──green: 33 └──blue: 33 └─propertyName: currentColorIn zwavejsui, the bulb changes rgb set but warm White remains to "204" and the bulb remais on warm White channel bacause is prevalent on rgb set.
x_zwave_values.Color_Switch_currentColor_0=204
x_zwave_values.Color_Switch_currentColor_1=0
x_zwave_values.Color_Switch_currentColor_2=235
x_zwave_values.Color_Switch_currentColor_3=33
x_zwave_values.Color_Switch_currentColor_4=33

Is it possible to set targetColor for "warmWhite" and "coldWhite" as well, and have something similar to this?
x_zwave_values.Color_Switch_targetColor={"warmWhite":0,"coldWhite":0,"red":235,"green":33,"blue":33}

Thanks in advance.
Good day all,
I have a reaction set up that I use for both troubleshooting and changing home modes when one of my family members arrives or leaves. I use the companion app for HASS on our iPhones, and HASS reports whether the person associated with the iPhone enters or leaves the geofenced area around my home. I'm sure most MSR and HASS users are familiar with this.
I use this rule set mainly as a condition for other rules; however, as part of troubleshooting, a notification is sent through HASS to the companion app when the rule becomes true. The problem is that I'm now getting notifications for both arriving and departing simultaneously.
96b3f7db-ba09-499e-a78c-86903b603857-image.png
36903cdd-a87f-473b-82ef-af9ef96d3c44-image.png
It used to work fine as intended. I'm not sure exactly when it changed, but now I'm getting two notifications when either of these conditions changes.
Any idea what could be happening?
Edit:
Running: latest-25082-3c348de6, bare-metal Linux
ZWaveJSController [0.1.25082]
MSR had been running fine, but I decided to follow the message to upgrade to 25067. Since the upgrade, I have received the message "Controller "<name>" (HubitatController hubitat2) could not be loaded at startup. Its ID is not unique." MSR throws the message on every restart. Has anyone else encountered this problem?
I am running MSR on a Raspberry Pi 4 connecting to two Hubitat units over an OpenVPN tunnel, one C8 and a C8 Pro. Both are up to date. It appears that, despite the error message, MSR may be operating properly.
Build 21228 has been released. Docker images available from DockerHub as usual, and bare-metal packages here.
- Home Assistant up to version 2021.8.6 supported; the online version of the manual will now state the current supported versions;
- Fix an error in OWMWeatherController that could cause it to stop updating;
- Unify the approach to entity filtering on all hub interface classes (controllers); this works for device entities only; it may be extended to other entities later;
- Improve error detail in messages for EzloController during auth phase;
- Add isRuleSet() and isRuleEnabled() functions to expressions extensions;
- Implement set action for lock and passage capabilities (makes them more easily scriptable in some cases);
- Fix a place in the UI where 24-hour time was not being displayed.

As with local expressions, global expressions evaluate and update fine when the getEntity(...) structure is used. However, at least when certain functions are in use, expressions do not update.
Consider the following test case:
Screenshot 2025-03-13 at 16.29.42.png
Even though auto-evaluation is active, the value does not change (it changes only if the expression is manually run). MSR restarts do not help.
Screenshot 2025-03-13 at 16.31.43.png
Note: Tested using build 25067 on Docker. I also have a PR open (but couldn't get its details or number just now, as my Mantis account seems to have expired?).
Trying to understand what causes a local expression to be evaluated. I have read the manual but I am still not clear about it. Using the test rule below, I can see in the log that the rule is automatically evaluated every time the temperature entity changes. That is great...
What I am trying to understand is why the expression is not also evaluated based on time, since the "case" statement has time dependencies.
Any help would be appreciated
I have the following test rule:
eba6a3ea-ff61-4610-88c9-9b9864f11ff8-Screenshot 2025-01-21 095244.png
2d9c1ff5-7b73-4005-b324-9029c2709db9-Screenshot 2025-01-21 095302.png
Here is the expression code:
vFrom1 = "09:25:00",
vFrom2 = "09:30:00",
vFrom3 = "09:41:00",
vTo = "10:55:00",
# Get current time (format HH:MM:SS)
vToDay = strftime("%H:%M:%S"),
# Get current house temperature
CurrentHouseTemp = getEntity( "hass>Thermostat2 " ).attributes.temperature_sensor.value,
case
  when CurrentHouseTemp <= 19 and vToDay >= vFrom1 && vToDay <= vTo: "true1" # From1
  when CurrentHouseTemp <= 20 and vToDay >= vFrom2 && vToDay <= vTo: "true2" # From2
  when CurrentHouseTemp < 26 and vToDay >= vFrom3 && vToDay <= vTo: "true3" # From3
  else "false"
end

I am getting a Runtime error in different browsers when I click Exit while editing an existing global reaction, or creating a new one, that contains a group. If the global reaction does not have a group, I don't get an error. I see a similar post on the forum about a Runtime Error when creating reactions, but I started a new thread as that one appears to be solved.
The Runtime Error is different in the two browsers
Safari v18.3
Google Chrome 133.0.6943.142
TypeError: self.editor.isModified is not a function
    at HTMLButtonElement.<anonymous> (http://192.168.10.21:8111/reactor/en-US/lib/js/reaction-list.js:171:34)

You may report this error, but do not screen shot it. Copy-paste the complete text. Remember to include a description of the operation you were performing in as much detail as possible. Report using the Reactor Bug Tracker (in your left navigation) or at the SmartHome Community.

Steps to reproduce:
Click the pencil to edit a global reaction with a group.
Click the Exit button.
Runtime error appears.
or
Click Create Reaction
Click Add Action
Select Group
Add Condition such as Entity Attribute.
Add an Action.
Click Save
Click Exit
Runtime error appears.
I don’t know how long the error has been there as I haven’t edited the global reaction in a long time.
Reactor (Multi-hub) latest-25060-f32eaa46
Docker
Mac OS: 15.3.1
Thanks
I am trying to delete a global expression (gLightDelay) but for some strange reason, it comes back despite clicking the Delete this expression and Save Changes buttons.
I have not created a global expression for some time and just noticed this while doing some clean-up.
I have upgraded Reactor to 25067 from 25060 and the behaviour is still there. I have restarted Reactor (as well as restarting its container) and cleared the browser's cache several times without success.
Here's what the log shows.
[latest-25067]2025-03-08T23:50:22.690Z <wsapi:INFO> [WSAPI]wsapi#1 rpc_echo [Object]{ "comment": "UI activity" }
[latest-25067]2025-03-08T23:50:26.254Z <GlobalExpression:NOTICE> Deleting global expression gLightDelay
[latest-25067]2025-03-08T23:50:27.887Z <wsapi:INFO> [WSAPI]wsapi#1 rpc_echo [Object]{ "comment": "UI activity" }

Reactor latest-25067-62e21a2d
Docker on Synology NAS
Morning, experts. Hard on the heels of learning about the internet check script in MSR tools, I was wondering what suggestions anyone has for a local (i.e. non-internet-dependent) notification method.
This was prompted by yesterday's fun and games with my ISP.
I've got the script Cronned and working properly but short of flashing a light on and off, I'm struggling to think of a way of alerting me (ideally to my phone)
I guess I could set up a Discord server at home, but that feels like overkill for a rare occasion. Any other suggestions?
TIA
C
Hi,
I'm trying to integrate the sonos-mqtt (https://sonos2mqtt.svrooij.io/) with the MSR and it's coming along nicely so far.
But I cannot wrap my head around how to define custom capabilities in MQTT templates. I need this for the TTS announcements and, similarly, for the notification sounds, where I would pass the sound file as a parameter.
So this is what I have in the local_mqtt_devices.yaml
capabilities:
  x_sonos_announcement:
    attributes:
    actions:
      speak:
        arguments:
          text:
            type: string
          volume:
            type: int
          delay:
            type: int

And this is the template:
templates:
  sonos-announcement:
    capabilities:
      - x_sonos_announcement
    actions:
      x_sonos_announcement:
        speak:
          topic: "sonos/cmd/speak"
          payload:
            expr: >
              {
                "text": parameters.text,
                "volume": parameters.volume,
                "delayMs": parameters.delay,
                "onlyWhenPlaying": false,
                "engine": "neural"
              }
            type: json

So the speak action should send something like this to the topic sonos/cmd/speak:
{ "text": "message goes here", "volume": 50, "delayMs": 100, "onlyWhenPlaying": false, "engine": "neural" }At startup the MSR seems to be quite unhappy with my configuration:
reactor  | [latest-25016]2025-02-09T08:19:59.029Z <MQTTController:WARN> MQTTController#mqtt entity Entity#mqtt>sonos-announcement unable to configure capabilities [Array][ "x_sonos_announcement" ]
reactor  | i18n: missing fi-FI language string: Configuration for {0:q} is incomplete because the following requested capabilities are undefined: {1}
reactor  | i18n: missing fi-FI language string: Configuration for {0:q} has unrecognized capability {1:q} in actions
reactor  | Trace: Configuration for {0:q} is incomplete because the following requested capabilities are undefined: {1}
reactor  |     at _T (/opt/reactor/server/lib/i18n.js:611:28)
reactor  |     at AlertManager.addAlert (/opt/reactor/server/lib/AlertManager.js:125:25)
reactor  |     at MQTTController.sendWarning (/opt/reactor/server/lib/Controller.js:627:30)
reactor  |     at MQTTController.start (/var/reactor/ext/MQTTController/MQTTController.js:268:26)
reactor  |     at async Promise.allSettled (index 0)

Configuration for "sonos-announcement" has unrecognized capability "x_sonos_announcement" in actions
Controller: MQTTController#mqtt  Last 10:21:37 AM
Configuration for "sonos-announcement" is incomplete because the following requested capabilities are undefined: x_sonos_announcement
Controller: MQTTController#mqtt  Last 10:21:37 AM

This is probably a pretty stupid question and the approach may not even work at all, but maybe someone, or @toggledbits for sure, could point me in the right direction.
Basically the idea is to be able to send TTS messages from reactions using entity actions. I've previously used HTTP requests to Sonos HTTP API (https://hub.docker.com/r/chrisns/docker-node-sonos-http-api/) for the same functionality, but since moving to sonos-mqtt, I need a way to send the TTS notifications using MQTTController. Along with the actual message, volume and delay must also be parameterizable.
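In case it helps to see the whole picture, this is roughly how I expect the entity that uses the template to be wired up in the mqtt controller's config in reactor.yaml. The key names here (entities, name, uses_template) are from memory and may well be wrong, which could be exactly where my problem is:

controllers:
  - id: mqtt
    implementation: MQTTController
    config:
      # ...broker settings etc. unchanged...
      entities:
        "sonos-announcement":
          name: "Sonos Announcement"
          uses_template: sonos-announcement   # assumption: refers to the template defined in local_mqtt_devices.yaml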
br,
mgvra
MSR latest-25016-d47fea38 / MQTTController [0.2.24293]
Hi, @toggledbits
I just noticed that, following a reboot of my Raspberry Pi, some of the rules that I was expecting to recover are not catching up after the reboot. I have made a simple test rule (rule-m6rz6ol1) with only "after Date/time" as trigger and "turn on a lamp" as a set reaction. All my infrastructure is on the same board, so Reactor, Hass, Zwavejs, ... are all rebooting.
Here is the sequence of the test case (All time converted to Zulu to match logs):
Rule "after Date/Time" set to 14:05:00z Shutdown on Raspberry Pi at 14:04:00z Power back up at 14:08:00z Rule overview shows true as of 14:08:14z waiting for 00:00:00 in GUIFrom the log I can see that MSR is picking up the rule and knows that the state of the rule has changed from false to true and tries to send the update to HASS but failed with websocket error.
Here is what I see from the log:
- 14:04:04z shutdown complete
- 14:08:08z power up
- 14:08:13.111z websocket connection
- 14:08:15.323z reaction to the light failed, websocket not opened
- After that there is a series of websocket connection attempts until 14:08:51z, where it seems to be really ready.

Back in 2021 we had a discussion (https://smarthome.community/topic/700/solved-start-up?_=1738766986566) and you proposed to add startup_delay:xxxx and startup_wait:xxxx parameters in the engine section of "reactor.yaml". When I try startup_delay (this used to be a hard delay), the engine fails to start (I think). I then tried startup_wait:xxxx without any success. Since it waits for the connection status to be up to cancel the delay, it does not do anything, because Hass reports the socket up without really being up (I think...).
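For reference, this is roughly what I put in reactor.yaml (just a sketch of my attempt; the value shown and my assumption that it is in milliseconds may both be wrong):

engine:
  # assumption: value is in milliseconds; I tried each key separately
  startup_wait: 60000
  # startup_delay: 60000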
Questions:
Did I figure it all wrong? Should the startup_delay:xxxxx have worked? Any ideas?

Here is the log:
OK, now I am stuck. I did add the log, but when I submit, the editor complains that I am limited to 32767 characters. The log from the shutdown to the time the websocket is stable is about 300,000 characters long. What are my options?
Not a big issue simply a request if easily doable.
The MSR log files inside the container are owned by root, which is fine; however, the permissions are very restrictive. I do not know if there is something wrong with my installation, but the log permissions are set to 222 (write only). Even if the docker volume is set for read/write, the log files retain these values.
I get around the problem by doing a chmod 777 on all Reactor logs, but every time there is an MSR log rotation the permissions are set back to 222. So unless the permissions are changed in the container, there is no permanent solution to this (that I know of).
I do not know much about Docker containers, so I do not know what is involved here.
Could the log file permissions simply be changed in the container to at least allow "other" read permission?
Could the MSR log rotation routine do a chmod to set the permissions?
Just a small annoyance.
Thanks
@toggledbits In the MSR documentation, under Standard Capabilities, I noticed that the
button.since attribute was deprecated as of version 22256 and the metadata is the preferred way to access the last-modified time of an attribute.
Am I reading this right? Should I stop using it in my rules?
Thanks
When on my bare metal RPi with MSR I had a rule that ran every minute to check Internet status via a script in MSR called reactor_inet_check.sh
I've moved to containerized MSR and see in the instructions that this cannot be run from the container.
The script cannot run within the Reactor docker container. If you are using Reactor in a docker container, the script needs to be run by cron or an equivalent facility on the host system (e.g. some systems, like Synology NAS, have separate task managers that may be used to schedule the repeated execution of tasks such as this).
I've put a script on my container host that calls the reactor_inet_check.sh script and it isn't erroring... but I still see the internet status within MSR as null.
Before I go diving down the rabbit hole... should this work?
My cronjob on the proxmox host:
909fe6f0-77fd-4734-80a4-c9e354c910b6-image.png
The contents of msr_internet_check_caller.sh
16337528-cf31-4968-bffe-af1149f7103e-image.png
Background: this is a Windows MSR install I've done for our local pool/amenity center just to run some fans and lights (not my daily driver at home). Install went perfectly fine.
Scenario: I want the lights to go on when it's dark enough (even if during a storm, not just after sunset) so I'm using solarRadiation from my weather station to drive that Trigger. Easy stuff.
Issue: sometimes, someone goes in the office and just starts flipping switches and the result can be lights turned on in the daytime or off at night. I'm trying to create a "catch-all" wherein if it is daytime and the lights somehow find their way ON, they will turn themselves back OFF.
I have the following Reaction built:
b30eab5b-5a14-4a3a-8c9a-47e3e7e53dc3-image.png
I also have this Reaction for the opposite, i.e. the lights find themselves turned off after dark and turn themselves back on:
5c6946b1-297c-4eb1-9618-74820979df29-image.png
Here are my two rules:
288cba86-f941-4157-86d9-d8e7487905f7-image.png
*NOTE that in my manual testing (i.e. I turn on the light switch at the incorrect time), when the solarRadiation level changes, the Lights ON rule flags and shows as SET. On the next change of solarRadiation it goes back to reset again.
My expectation is that the Lights OFF rule should see that the lights are on and the solarRadiation is above the set limit, and turn them off. Instead, every other run, the ON rule moves to SET and then resets again on the following run.
Logs appear angry:
[latest-25016]2025-01-26T22:03:31.696Z <Engine:INFO> Enqueueing "Lights ON<RESET>" (rule-m6e4ajh7:R)
[latest-25016]2025-01-26T22:03:31.712Z <Engine:NOTICE> Starting reaction Lights ON<RESET> (rule-m6e4ajh7:R)
[latest-25016]2025-01-26T22:03:31.713Z <Engine:INFO> Lights ON<RESET> all actions completed.
[latest-25016]2025-01-26T22:03:42.565Z <wsapi:INFO> client "127.0.0.1#3" closed, code=1001, reason=
[latest-25016]2025-01-26T22:03:42.753Z <httpapi:INFO> [HTTPAPI]#1 API request from ::ffff:127.0.0.1: GET /api/v1/lang
[latest-25016]2025-01-26T22:03:42.754Z <httpapi:INFO> [HTTPAPI]#1 request for /api/v1/lang from ::ffff:127.0.0.1 user anonymous auth none matches /api/v1/lang ACL (#7): [Object]{ "url": "/api/v1/lang", "allow": true, "index": 7 }
[latest-25016]2025-01-26T22:03:42.790Z <wsapi:INFO> wsapi: connection from ::ffff:127.0.0.1
[latest-25016]2025-01-26T22:03:42.839Z <wsapi:INFO> [WSAPI]wsapi#1 client "127.0.0.1#6" authorized
[latest-25016]2025-01-26T22:03:43.353Z <httpapi:INFO> [HTTPAPI]#1 API request from ::ffff:127.0.0.1: GET /api/v1/systime
[latest-25016]2025-01-26T22:03:43.353Z <httpapi:INFO> [HTTPAPI]#1 request for /api/v1/systime from ::ffff:127.0.0.1 user anonymous auth none matches /api/v1/systime ACL (#5): [Object]{ "url": "/api/v1/systime", "allow": true, "index": 5 }
[latest-25016]2025-01-26T22:03:48.146Z <wsapi:INFO> client "127.0.0.1#6" closed, code=1001, reason=
[latest-25016]2025-01-26T22:03:48.308Z <httpapi:INFO> [HTTPAPI]#1 API request from ::ffff:127.0.0.1: GET /api/v1/lang
[latest-25016]2025-01-26T22:03:48.309Z <httpapi:INFO> [HTTPAPI]#1 request for /api/v1/lang from ::ffff:127.0.0.1 user anonymous auth none matches /api/v1/lang ACL (#7): [Object]{ "url": "/api/v1/lang", "allow": true, "index": 7 }
[latest-25016]2025-01-26T22:03:48.346Z <wsapi:INFO> wsapi: connection from ::ffff:127.0.0.1
[latest-25016]2025-01-26T22:03:48.390Z <wsapi:INFO> [WSAPI]wsapi#1 client "127.0.0.1#7" authorized
[latest-25016]2025-01-26T22:03:49.412Z <httpapi:INFO> [HTTPAPI]#1 API request from ::ffff:127.0.0.1: GET /api/v1/systime
[latest-25016]2025-01-26T22:03:49.413Z <httpapi:INFO> [HTTPAPI]#1 request for /api/v1/systime from ::ffff:127.0.0.1 user anonymous auth none matches /api/v1/systime ACL (#5): [Object]{ "url": "/api/v1/systime", "allow": true, "index": 5 }
[latest-25016]2025-01-26T22:03:52.734Z <wsapi:INFO> client "127.0.0.1#7" closed, code=1001, reason=
[latest-25016]2025-01-26T22:03:52.891Z <httpapi:INFO> [HTTPAPI]#1 API request from ::ffff:127.0.0.1: GET /api/v1/lang
[latest-25016]2025-01-26T22:03:52.892Z <httpapi:INFO> [HTTPAPI]#1 request for /api/v1/lang from ::ffff:127.0.0.1 user anonymous auth none matches /api/v1/lang ACL (#7): [Object]{ "url": "/api/v1/lang", "allow": true, "index": 7 }
[latest-25016]2025-01-26T22:03:52.925Z <wsapi:INFO> wsapi: connection from ::ffff:127.0.0.1
[latest-25016]2025-01-26T22:03:52.965Z <wsapi:INFO> [WSAPI]wsapi#1 client "127.0.0.1#8" authorized
[latest-25016]2025-01-26T22:03:54.383Z <httpapi:INFO> [HTTPAPI]#1 API request from ::ffff:127.0.0.1: GET /api/v1/systime
[latest-25016]2025-01-26T22:03:54.384Z <httpapi:INFO> [HTTPAPI]#1 request for /api/v1/systime from ::ffff:127.0.0.1 user anonymous auth none matches /api/v1/systime ACL (#5): [Object]{ "url": "/api/v1/systime", "allow": true, "index": 5 }
[latest-25016]2025-01-26T22:04:01.590Z <wsapi:INFO> [WSAPI]wsapi#1 rpc_echo [Object]{ "comment": "UI activity" }
[latest-25016]2025-01-26T22:04:39.646Z <Rule:INFO> Lights OFF (rule-m6e33ja3 in Atrium Lights) evaluated; rule state transition from RESET to SET!
[latest-25016]2025-01-26T22:04:39.656Z <Rule:INFO> Lights ON (rule-m6e4ajh7 in Atrium Lights) evaluated; rule state transition from RESET to SET!
[latest-25016]2025-01-26T22:04:39.663Z <Engine:INFO> Enqueueing "Lights OFF<SET>" (rule-m6e33ja3:S)
[latest-25016]2025-01-26T22:04:39.665Z <Engine:INFO> Enqueueing "Lights ON<SET>" (rule-m6e4ajh7:S)
[latest-25016]2025-01-26T22:04:39.668Z <Engine:NOTICE> Starting reaction Lights OFF<SET> (rule-m6e33ja3:S)
[latest-25016]2025-01-26T22:04:39.669Z <Engine:NOTICE> Starting reaction Lights ON<SET> (rule-m6e4ajh7:S)
[latest-25016]2025-01-26T22:04:39.669Z <Engine:INFO> Lights ON<SET> all actions completed.
[latest-25016]2025-01-26T22:04:39.675Z <Rule:INFO> Lights OFF (rule-m6e33ja3 in Atrium Lights) evaluated; rule state transition from SET to RESET!
[latest-25016]2025-01-26T22:04:39.680Z <Engine:NOTICE> ReactionHistory: no entry for
[latest-25016]2025-01-26T22:04:39.683Z <Engine:NOTICE> [Engine]Engine#1 entry 256 reaction rule-m6e33ja3:S-1q2f1j0p: [Error] terminated [parent terminating]
[latest-25016]2025-01-26T22:04:39.683Z <Engine:CRIT> Error: terminated [parent terminating] Error: terminated
    at Engine._process_reaction_queue (C:\Users\Jalan\msr\reactor\server\lib\Engine.js:1644:47)
[latest-25016]2025-01-26T22:04:39.699Z <Engine:NOTICE> [Engine]Engine#1 entry 254 reaction rule-m6e33ja3:S: [Error] terminated [preempted by rule state change]
[latest-25016]2025-01-26T22:04:39.699Z <Engine:CRIT> Error: terminated [preempted by rule state change] Error: terminated
    at Engine._process_reaction_queue (C:\Users\Jalan\msr\reactor\server\lib\Engine.js:1644:47)
[latest-25016]2025-01-26T22:04:39.700Z <Engine:INFO> Enqueueing "Lights OFF<RESET>" (rule-m6e33ja3:R)
[latest-25016]2025-01-26T22:04:39.704Z <Engine:NOTICE> Starting reaction Lights OFF<RESET> (rule-m6e33ja3:R)
[latest-25016]2025-01-26T22:04:39.705Z <Engine:INFO> Lights OFF<RESET> all actions completed.
[latest-25016]2025-01-26T22:05:48.822Z <Rule:INFO> Lights ON (rule-m6e4ajh7 in Atrium Lights) evaluated; rule state transition from SET to RESET!
[latest-25016]2025-01-26T22:05:48.831Z <Engine:INFO> Enqueueing "Lights ON<RESET>" (rule-m6e4ajh7:R)
[latest-25016]2025-01-26T22:05:48.847Z <Engine:NOTICE> Starting reaction Lights ON<RESET> (rule-m6e4ajh7:R)
[latest-25016]2025-01-26T22:05:48.847Z <Engine:INFO> Lights ON<RESET> all actions completed.

Hi @toggledbits
I found this very old post that talked about a way to limit device updates to avoid the throttled problem, because it's not a question of logic; it's that the device actually sends a lot of information, in my case the NUT UPS integration installed in HE.
https://smarthome.community/topic/687/flapping-device?_=1737652139854
It mentions setting update_rate_limit in the engine section of reactor.yaml, but I looked in the current MSR documentation and I can't find this information, so I don't know if it's still valid, what its effect is, or what its parameters are.
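In case it helps, this is what I guessed it might look like (purely my assumption, since the current docs don't mention it; I also assume the number is updates per minute, going by the "120/min" limit in the log below):

engine:
  # assumption: maximum rule update rate per minute (the log messages suggest the default is 120/min)
  update_rate_limit: 240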
My situation is simple: when I have a UPS problem, NUT sends dozens of reports per second and then I get the throttled problem. The same thing happens when power returns to normal.
This is the rule, and the parameter that fails is the Tripp Lite UPS status.
cf9ddabf-3144-4e5a-80a4-0dc7664b9573-image.png
a813a077-974e-4737-897c-e383085b3d8f-image.png
All error is the same scenario.
[latest-25016]2025-01-23T12:01:32.753Z <Rule:WARN> (13) NUT Disconected (rule-l4djr0p7 in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Throttl>
[latest-25016]2025-01-23T12:01:32.756Z <Rule:WARN> (27) Falta de Energia (rule-l4h9ceod in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Thrott>
[latest-25016]2025-01-23T12:01:32.769Z <Rule:WARN> (73) UPS Battery Low (rule-l4hj850o in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Throttl>
[latest-25016]2025-01-23T12:01:32.772Z <Rule:WARN> (74) UPS Comm Fail (rule-l4kbs5cp in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Throttlin>
[latest-25016]2025-01-23T12:01:32.776Z <Rule:WARN> (76) UPS Utility Back (rule-l4hjhs6m in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Thrott>
[latest-25016]2025-01-23T12:01:32.780Z <Rule:WARN> UPS On Battery (rule-l4hjuka5 in Datacenter) update rate 121/min exceeds limit (120/min)! Logic loop? Throttling>
[latest-25016]2025-01-23T12:01:32.781Z <Rule:WARN> UPS Info (rule-l4gheo63 in Datacenter) update rate 121/min exceeds limit (120/min)! Logic loop? Throttling...
[latest-25016]2025-01-23T12:01:40.757Z <Rule:WARN> (13) NUT Disconected (rule-l4djr0p7 in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Throttl>
[latest-25016]2025-01-23T12:01:40.759Z <Rule:WARN> (27) Falta de Energia (rule-l4h9ceod in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Thrott>
[latest-25016]2025-01-23T12:01:40.776Z <Rule:WARN> (73) UPS Battery Low (rule-l4hj850o in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Throttl>
[latest-25016]2025-01-23T12:01:40.777Z <Rule:WARN> (74) UPS Comm Fail (rule-l4kbs5cp in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Throttlin>
[latest-25016]2025-01-23T12:01:40.778Z <Rule:WARN> (76) UPS Utility Back (rule-l4hjhs6m in Warning) update rate 121/min exceeds limit (120/min)! Logic loop? Thrott>

Thanks.
Hello -
Long time. Hope everyone is good.
I have a rule that looks at a number of temperature sensors around the house. It simply sends a general alert if any of them fall below their threshold. (A basic “House is too cold” alert for when we’re away)
Generally, this has worked well. But I was wondering if there’s a way to make the message somewhat dynamic without creating separate rules for each sensor.
E.g. “House is too cold due to Sump Temperature below 45 degrees.”
I thought I remember reading about someone doing this in the past but couldn’t find it.
Thanks for any ideas!
[Solved] Is there a cap or max number of devices a Global Reaction should not exceed?
-
@wmarcolin Very good. That means the more aggressive checks are working. It appears that Hubitat's event socket is a good bit more fragile than its Hass equivalent. I will add an option to the next release to silence this warning (although the reconnect will still be logged in the log file). You should also be able to see the effect of the reconnects on the system entity's x_hubitat_sys.reconnects counter.

For comparison, I don't get these errors unless I force them. The one restart shown below was because I upgraded the hub to 2.3.0.120.
-
TL;DR: Hubitat needs more aggressive WebSocket connection health tests and recovery, and that's been added as of 21351. When recovery is needed (which should be very rare), device states may be delayed up to 120 seconds. Don't use WiFi for your Reactor host or hub in production. If your Reactor host and hub aren't on the same network segment (LAN), you may see more reconnects. If you see reconnects when your Reactor host and hub are on the same network segment, you likely have a network quality issue.
I want to explain how I understand the problem reported, and how the fix (which is more of a workaround) works.
The events websocket is one of two channels used by HubitatController to get data from the hub. When HubitatController connects to the hub, it begins with a query to MakerAPI to fetch the bulk data for all devices -- "give me everything". Thereafter, the events socket provides (only) updates (changes). Being a WebSocket connection, it has a standard-required implementation of ping-pong that both serves to keep the connection alive and test the health of the connection. In Reactor, I use a standard library to provide the WebSocket implementation, and this library is in wide use, so while it's almost certainly not bug free (nothing is), it has sufficient exposure to be considered trustworthy. I imagine Hubitat does the same thing, but since it's a closed system, I don't know for sure; they may use something common, or they may have rolled their own, or they may have chosen some black sheep from among many choices for whatever reason. In any case, neither Hubitat nor Reactor implement the WebSocket protocol itself, we just use our respective WebSocket libraries to open and manage the connections and send/receive data.
Apparently there is a failure mode for the connection, and we don't know if it's on the Hubitat (Java) side or in the nodejs package, where the events can stop coming, but apparently the ping-pong mechanism continues to work for the connection, otherwise it would be torn down/flagged as closed/error by the libraries on both ends. There's no easy way to tell if Hubitat has stopped sending messages or the nodejs library has stopped receiving or passing them, and since the libraries/packages on both ends are black boxes as far as I'm concerned, I don't really care, I just want it to work better. So...
HubitatController versions prior to 21351 relied solely on the WebSocket's native ping-pong mechanism to describe connection health, as Reactor does for Hass and even its own UI-to-Engine connection (lending credence to the theory that the nodejs library is not the cause). But for Hubitat it appears the WebSocket ping-pong alone is not enough, so 21351 has introduced some additional tests at the application layer. If any of these fails, the connections are closed and re-opened. When reopened, a full device/state inventory is done again as usual, so the current state of all devices is reestablished. Any missed device updates during the "dead time" would be corrected by this inventory.
By the way, one of the things that exacerbates the problem with the Hubitat events WebSocket is that it's a one-way connection at its application layer: Hubitat only transmits. There is no message I can send over the WebSocket for which I could expect a speedy reply as proof of health. I have to find other things to do through MakerAPI in an attempt to force Hubitat to send me data over the WebSocket, and this takes more time as well. If there were two-way communication, it would be a lot easier and faster to know whether the connection was healthy.
So the question that remains, then, is what is that timing? By default, HubitatController will start its aggressive recovery at 60 seconds of channel silence. If the channel then remains silent for an additional 60 seconds, the close/re-open recovery occurs. So even if the connection fails, the maximum time to recovery and correct state of all devices will be just over 120 seconds. So even in worst-case conditions, entity states should not lag more than that. Given that these stalls are the exception rather than the rule for most users, these pauses should be rare.
There is one tuning parameter that may be useful to set on VPN connections or any other "distanced" connection (i.e. any connection where the Reactor host and the hub are not on the same network segment, and in particular may traverse connection-managing software or hardware like proxies, stateful packet inspection and intrusion detection systems, load balancers, etc.). That is websocket_ping_interval, which will be added to the next build. This will set the interval, in milliseconds, between pings (default 60000). This should be sufficiently narrow to prevent some VPNs from aborting the socket in some cases (see the WebSocket missive at the end), but if not, smaller values can be tried, at the expense of additional network traffic and a slight touch on CPU. If the reconnects don't improve significantly, a different VPN option should be chosen. (A sketch of where this setting would go appears after the two recommendations below.)

And this brings me to two recommendations:
- You should not use a WiFi connection for either the Reactor host or hub in production use. These are fine for testing and experimentation, but are an inappropriate choice in production for both reliability and performance reasons.
- If you use a VPN between the Reactor host and the hub, "subscription VPNs" are probably best avoided, as these will be the most aggressive in connection management and cause the most disconnects and failures. That's because they are tuned for surfing web traffic and checking email, basically, where the connections are open-query-response-close — connections don't stay open very long, typically. There are optimizations of HTTP where connections are kept open after a response to allow for a follow-up query (e.g. request an embedded image after requesting a document), but these are generally much shorter than the expected infinite open of a WebSocket connection (more on this at the bottom). Point-to-point VPNs that you set up and manage yourself are likely to provide better stability and performance (e.g. PPTP, SSH tunnels, etc.).
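As a sketch only (the placement under the controller's config section in reactor.yaml is illustrative; keep your existing HubitatController settings as they are):

controllers:
  - id: hubitat
    implementation: HubitatController
    config:
      # ...your existing HubitatController settings...
      websocket_ping_interval: 60000   # milliseconds between pings; 60000 is the default mentioned above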
I will also add this: in my network, I do not get Hubitat WebSocket stalls and reconnects. I have had to force them through various devious means to test the behavior I've just implemented. I owned a commercial data center in the San Francisco Bay Area, with a managed network offered to clients with 100% uptime service level agreements. I built and maintained that network. My home network is a reflection of that — good quality equipment, meticulous cabling, sensible architecture (scaled down appropriately for the lesser scope and demands), and active data collection and monitoring. My network runs clean, and when there are problems, I know it (and where). If your Reactor host and Hubitat hub are on the same network segment, hardwired and not WiFi, and you are getting reconnects, I think you should audit your network quality. Something isn't happy. It only takes one bad cable, or one bad connector on one end of one cable, to cause a lot of problems.
---
For anyone interested, one reason why WebSockets can be troublesome in network environments where connection management may be done between the endpoints is that a WebSocket typically begins its life as an HTTP request. The client makes an HTTP request to the server with specific headers that ask that the connection to be "converted" (they call it "upgraded") from HTTP to WebSocket. If the server agrees, the connection becomes persistent and a new session layer is introduced. But because the connection starts as HTTP, any interstitial proxy or device that is managing the connection as it passes through may mistake it for a plain HTTP (web page) request, and when the connection doesn't tear itself down after a short period the proxy/device thinks is reasonable for HTTP requests, it forces the issue and sends a disconnect to both ends. This is necessary because tracking open connections consumes memory and CPU on these devices, and in a commercial ISP environment this could mean tens of thousands or hundreds of thousands of open connections at a single interface/gateway. So to keep from being overwhelmed, these devices may just time out those connections (on a predictable schedule/timeout, or just due to load), but because it's not really an HTTP connection at that point (it's been upgraded to a WebSocket), the proxy/device is breaking a connection that both the server and client expect to be persistent, and that can then cause all kinds of problems, the most benign of which is forcing the two endpoints to reconnect frequently and waste a lot of time and bandwidth doing it. On a LAN, you typically don't have these problems, because the two endpoints have no stateful management between them (network switches, if present between, just pass traffic, not manage connections), so barring network problems, there's no reason for them to be disconnected until either asks to close.
-
@toggledbits master!
Another lesson in knowledge, and dedication to understanding and solving problems.
Well, I spent two days reading your post before trying to answer anything.
First, starting from the end: in my case my network is all gigabit, CAT7 cables with industrial connectors. All the cabling of the house was done with pre-terminated cables, so as not to run the risk of redoing connectors. I even hired a company to certify the network, and it also uses managed switches so that I can validate the network. Ping inside my house between any two pieces of equipment is < 1 ms.
From the technical side, what I see is that the connection between MSR and Hubitat is fragile, and it becomes even more so because MSR is much faster in its actions than Hubitat is in responding.
What you would be doing for a future version is strengthening the validation of this communication, in particular returning the status of issued commands. That is, if you tell MSR to turn on a light, making sure that the returned state says that the light is on.
One doubt: in the case where the returned state says it was not turned on, would there be a reset? Is that a fair summary?
-
Ping (and traceroute) are not network quality tools. They are path tools. To measure link quality, since you said you had managed switches, you'd need to look at the error counts on each port. That may be worth a squiz.
As @Alan_F has observed, you can get restarts from a quiet channel that is just naturally quiet. If you don't have a lot of devices, and things aren't changing often, it's very likely to see a quiet channel. HubitatController probes by picking a device and making it refresh, which usually causes some event activity on the WebSocket channel. But it's possible that the device it picks (randomly) may not do much when refreshed. You can set probe_device to the device ID (number) of a device that consistently causes events when asked to refresh, to mitigate that random effect. That may take some experimentation, using the Hubitat UI to ask devices to refresh and watching for green highlights in the Reactor Entities list. When you find a device that consistently "lights up" when you refresh it on Hubitat, you've probably got a reliable probe device.

I have noted in my own network that certain devices (that shall be unnamed) are sufficiently chatty that I'm at no risk of a quiet channel. Let's just say anything that monitors energy is a good canary in the mine.
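For illustration (a sketch; the placement under the controller's config in reactor.yaml and the device number are just examples):

controllers:
  - id: hubitat
    implementation: HubitatController
    config:
      # ...your existing HubitatController settings...
      probe_device: 123   # example: Hubitat device ID of a device that reliably generates events when refreshed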
-
The ping tests were to validate the connection and speed; below is the screen of the switch that MSR and Hubitat are on, with no errors. I have a weekly reset programmed, so this information is from last Friday until now.
OK, so what I should set now is a device connection probe, something like the Device Watchdog app. I'm going this way.
Anyway, I think I mentioned before that I ended up moving Hubitat far away from my WiFi router on Saturday, and since last night, after deactivating the devices again one by one, I have started rebuilding my mesh. I already see good results from the move: several devices that previously had trouble responding now seem to work better. Hopefully, by Wednesday I will have finished the rebuild and will be able to check how the network behaves and how the ping-pong between MSR and Hubitat is doing with the latest version of MSR.
Thanks.
-
Very good. In my limited experience with this new heuristic, Z-Wave devices seem to be the most predictable for probes among the basic devices. In the current build, whatever you choose must support the Refresh (Hubitat native) capability (aka x_hubitat_Refresh in Reactor). In the next build, the config value probe_action will be available to let you use a different action, if you need to, and probe_parameters (an object) containing any key/value parameter pairs that the command may need (optional).

I've also found that the "Hub Information" app (or more correctly, the device that this app manages) makes a good probe target. It has some useful data, and also very reliably updates fields when commanded to do so, even at a high frequency, so the next build will have specific support for this app/device if it finds it among the hub's inventory.
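Once those land, the config might look roughly like this (a sketch only; the form of the action name and the parameter shown are placeholders, not confirmed syntax):

controllers:
  - id: hubitat
    implementation: HubitatController
    config:
      # ...your existing HubitatController settings...
      probe_device: 123                # example device ID
      probe_action: refresh            # assumption: the action to invoke instead of the default Refresh
      probe_parameters:                # optional key/value pairs the command may need
        exampleParameter: exampleValue # placeholder only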
-
@toggledbits hi!
I have been operating for two days with version 23360, and the chaos described in the messages above is no longer happening. The recurrent failures, where an action was left without being executed as a whole, are no longer observed.
Thanks for all your efforts!
A question: in version 21353 we started seeing the message below, which you made it possible to deactivate the alert for.
Is it possible to make some kind of query, or trigger some action, when this message happens? I would like, for example, to send a Telegram message to notify me.
Thanks.
-
Yes, there are at least two ways to do that now. Any thoughts on where you might look?
Edit: Same question as this thread: https://smarthome.community/topic/832/identifying-when-msr-cannot-connect-to-home-assistant
And this thread is wandering, so maybe let's leave it alone.
-
Hi, I asked @toggledbits to reopen this very long thread to give my testimony about a situation that I believe can help others.
I apologize for the long message; let me recapitulate the history.
The discussion was based on possible device limits on MSR actions, which Patrick explained do not exist, but the situation described was very similar to the scenario I posted in my message of numerous failures in actions sent from MSR to Hubitat (December 16, 12:28am): MSR was much faster than Hubitat could process.
I posted an example where I had to execute a series of Reactions that would turn on lights and also turn on outlets; it always failed. I demonstrated that MSR would activate everything, but Hubitat would not execute.
Then came the topic and the guidance that my Hubitat should not be next to my WiFi router, which could be interfering with the Hubitat signal. I even sent a picture on December 17 and made the change (TIP 1).
Well, obviously, as the change of position was big, I had to redo the entire mesh network 3 times, adding and removing devices. I followed the instruction of many masters: build the network from the center, i.e., from Hubitat outward, first including devices that use mains power because they are repeaters, and then those that run exclusively on battery (TIP 2).
Well, our friend Patrick released version 21351 and then 23360, where he added much more aggressive management of the MSR x Hubitat communication, and it improved a lot. But unfortunately, I kept having problems.
Then I asked for help again to move forward and see what to do to improve. We got into the topic that several posts from the Hubitat community mention devices that use S0 security, and that this creates problems in the mesh network through high traffic of unnecessary information. The next action was to remove and re-include the devices that had S0 (TIP 3), another action that helped the network. Where that was not possible, we reactivated the Vera hub and put those devices back on it, removing them from the Hubitat network.
We were making progress, but the situation persisted: actions that triggered many actions on Hubitat could still fail, and worse, actions using Hubitat's own dashboard were also not being executed.
Well, I returned to the discussion from when I changed from Vera to Hubitat, which highlighted several points of change, but one bothered me a lot: the Hubitat Z-Wave signal, much weaker than the Vera's (https://smarthome.community/topic/776/switching-from-vera-to-hubitat/9?_=1644189698709).
Well, 4 days ago (2/2), I worked up the courage, opened my Hubitat, followed the post (https://community.hubitat.com/t/external-antenna/81396/28) (TIP 4), and installed an external antenna for Z-Wave; here I show what I bought and installed (https://community.hubitat.com/t/elevation-c7-possible-faulty-z-wave-radio/52977/91). In that post the discussion started with the topic of S0 and S2, and moved into the antenna topic.
MY TESTIMONY OF WHAT HAPPENED
A revolution: my Hubitat got a new life, it is a different piece of equipment:
- Before, I had 15 devices connected directly to the hub; today, after 4 days, there are already 36 of 64, and I see the number increasing every day as the network is restructured. There was the absurd case of equipment in the same room as the HE, but behind a column, using two other devices to reach the HE that was less than 4 meters away; now it communicates directly;
- There was almost no equipment that communicated at 100 kbps; most were between 9.6 and 40. Now most are already at 100, and a small number, 5 of 64, are at 9.6 kbps;
- Remember the case where I had to turn on several lights and power outlets all together? It doesn't fail anymore; since the devices talk better and faster with the HE, I don't see this failure anymore;
- I also talked about actions commanded from the Dashboard that the device did not respond to; that is not happening anymore either.
In summary: in my case, where 3/4 of the devices are not repeaters (they use batteries), my Z-Wave mesh network lacked repeaters, and this caused a generalized degradation. Now, with the better signal, if I have not eliminated the problem, I have reduced it to almost zero.
Now pay attention: the operation of fitting the external antenna seems simple, but it is not. The antenna connector that is soldered to the board is very small and difficult to handle. So if you go this way, look for a cell phone repair shop; they will surely have the best technique for this change.
Thank you, and sorry again for the long message.
@gwp1 maybe this can help if you still have a problem. Your tip to move the HE away from the WiFi was also precious, thank you.
@SweetGenius your comment that the hub might be getting overwhelmed was correct. The action_pace action was a help, but the signal improvement I describe was the solution: with faster responses from the devices, the overwhelm is reduced. Thanks.
@toggledbits our last messages, before I wanted to incinerate Hubitat, also helped a lot on the way. Thank you for all your dedication.