With the boom of video doorbells from the likes of Ring, Skybell and Nest, I came to the realization that I did not want to be cloud dependent for this type of service, for reasons of long-term reliability, privacy and cost.
Last year I finally found a wifi video doorbell which is cost effective and supports RTSP and now ONVIF streaming:
The RCA HSDB2A, which is made by Hikvision and has many clones (EZViz, Nelly, LaView). It has an unusual vertical aspect ratio designed to watch for packages left on the ground...
It also runs on 5GHz wifi, which is a huge advantage. I have tried running IP cams on 2.4GHz before and it is a complete disaster for your wifi bandwidth; look at a spectrum analyzer and you will see what I mean. The constant high-bitrate traffic completely saturates the channels, which is a horrible design. 2.4GHz gets range but is too limited in bandwidth to support any kind of video stream reliably... unless you have a dedicated SSID and channel available for it.
The video is recorded locally on my NVR. I was able to process its stream on Home Assistant to do facial recognition and trigger automations on openLuup, just like with any other IP cam. This requires quite a bit of CPU power...
I also get snapshots through push notifications via Pushover on motion, like all of my other IP cams. Motion detection is switched on and off by openLuup... based on house mode.
Sharing a few options for object recognition which then can be used as triggers for home automation.
My two favorites so far:
Watsor (asmirnou/watsor on GitHub): object detection for video surveillance.
opencv-python (opencv/opencv-python on GitHub): an automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
I have optimized my facial recognition scheme and discovered a few things:
My wifi doorbell, the RCA HSDB2A, was overloaded by having to provide too many concurrent RTSP streams, which was making the streams themselves unreliable:
- Cloud stream
- Stream to the QNAP NVR
- Stream to Home Assistant (regular)
- Stream to Home Assistant (facial recognition)
I decided to use the proxy function of the QNAP NVR so that only two streams are pulled from the doorbell, with the NVR serving as the source for Home Assistant. This stabilized the system quite a bit.
The second optimization came from finding out that by default Home Assistant processes images only every 10s. It made me think that the processing was slow, but it turns out it was just not being triggered frequently enough. I lowered the interval to 2s and now I have a working automation that triggers an openLuup scene, opening a door lock with conditionals on house mode and geofence. Next I am looking to offload this processing from the CPU to an Intel NCS2 stick, so I might test some components other than dlib to make things run even faster.
Sharing what I have learned and some modifications to components with their benefits.
On Home Assistant/python3, facial recognition involves the following steps:
1. Establishing a stream from the camera
2. Extracting frames from the stream
3. Loading the pre-set pictures of known faces
4. Detecting faces in the frame
5. Encoding the detected faces into embeddings
6. Matching the embeddings against the known faces
Even though components have existed on Home Assistant for years to do this, I ran into challenges which forced me to improve/optimize the process.
Home Assistant's camera integration does not establish and keep open a stream in the background. It can open one on demand through its UI but doesn't keep it open. This forces the facial recognition component to re-establish a new stream every time it needs a single frame to process, causing up to 2s of delay, which is unacceptable for my application. I therefore rewrote the ffmpeg camera component to use opencv and maintain the stream within a python thread, and since I have a GPU, I decided to decode the video on the GPU to relieve the CPU (a sketch of this pattern appears after the summary list below). This also required handling some subtleties to avoid uselessly decoding frames we won't process while still removing them from the thread's buffer. Frame extraction was pretty challenging using ffmpeg, which is why I opted for opencv instead, as it handles the frame synchronization and alignment from the byte stream for us.

The pre-set pictures were not a problem and are part of every face component. I started with the dlib component, which offers two models for ease of use. It makes use of the dlib library through the "face_recognition" wrapper, which has a python3 API, but the CNN model requires a GPU and, while it worked well for me, turned out not to be the best choice, as explained in this article, besides being quite resource intensive: https://www.learnopencv.com/face-detection-opencv-dlib-and-deep-learning-c-python/

So I opted to move to the opencv DNN model instead. Home Assistant has an openCV component, but it is a bit generic and I couldn't figure out how to make it work; in any case, it did not cover steps 5 and 6, which I wanted. For the face encoding step I struggled quite a bit, as the choice is directly connected to the option chosen for step 6. From my investigation, I came to this: https://www.pyimagesearch.com/2018/09/24/opencv-face-recognition/
"*Use dlib’s embedding model (but not it’s k-NN for face recognition)
In my experience using both OpenCV’s face recognition model along with dlib’s face recognition model, I’ve found that dlib’s face embeddings are more discriminative, especially for smaller datasets.
Furthermore, I’ve found that dlib’s model is less dependent on:
Preprocessing such as face alignment
Using a more powerful machine learning model on top of extracted face embeddings
If you take a look at my original face recognition tutorial, you’ll notice that we utilized a simple k-NN algorithm for face recognition (with a small modification to throw out nearest neighbor votes whose distance was above a threshold).
The k-NN model worked extremely well, but as we know, more powerful machine learning models exist.
To improve accuracy further, you may want to use dlib’s embedding model, and then instead of applying k-NN, follow Step #2 from today’s post and train a more powerful classifier on the face embeddings.*"
The trouble, from my research, is that while I can see some people have tried, I have not found posted anywhere a solution for translating the location array output of the opencv DNN model into the dlib rect object format that dlib needs for encoding. Well, I did just that...
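The conversion itself boils down to scaling the SSD detector's normalized boxes to pixels and wrapping them as dlib rectangles. Here is a minimal sketch of the idea (not the actual component code; the model file names, the 0.6 confidence threshold and the 0.6 match tolerance are assumptions):

```python
import cv2
import dlib
import numpy as np

# Assumed model files: OpenCV's res10 SSD face detector, plus dlib's
# landmark predictor and ResNet face encoder (paths are illustrative).
net = cv2.dnn.readNetFromCaffe("deploy.prototxt",
                               "res10_300x300_ssd_iter_140000.caffemodel")
predictor = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")
encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

def detect_faces(frame, conf_threshold=0.6):
    """Run the OpenCV DNN detector and return dlib.rectangle objects."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()  # shape: (1, 1, N, 7)
    rects = []
    for i in range(detections.shape[2]):
        if detections[0, 0, i, 2] < conf_threshold:
            continue
        # The SSD outputs normalized [x1, y1, x2, y2]; scale to pixels
        # and wrap in the rect format dlib's encoder expects.
        x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int)
        rects.append(dlib.rectangle(left=x1, top=y1, right=x2, bottom=y2))
    return rects

def encode_faces(frame, rects):
    """Compute 128-d dlib embeddings for the detected rectangles."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    return [np.array(encoder.compute_face_descriptor(rgb, predictor(rgb, r)))
            for r in rects]

def is_match(known, candidate, tolerance=0.6):
    """Simple Euclidean-distance match, as described below."""
    return np.linalg.norm(known - candidate) <= tolerance
```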
For now I am sticking with the simple Euclidean distance calculation and a distance threshold to determine a face match (as in the sketch above), since it has been quite accurate for me, but the option of moving to a much more complex classification algorithm remains open... when I get to it.

So in summary, the outcome is modifications to:
A. the ffmpeg camera component to switch to opencv and enable background maintenance of a stream with one rewritten file:
https://github.com/rafale77/home-assistant/blob/dev/homeassistant/components/ffmpeg/camera.py
B. Changes to the dlib face recognition component to support the opencv face detection model:
https://github.com/rafale77/home-assistant/blob/dev/homeassistant/components/dlib_face_identify/image_processing.py
C. A modified face_recognition wrapper to do the same, enabling conversion between dlib and opencv:
https://github.com/rafale77/face_recognition (the world's simplest facial recognition API for Python and the command line)
D. Additions of the new model to the face_recognition_models library, involving adding a couple of files (init.py plus the model files under face_recognition_models/models):
https://github.com/rafale77/face_recognition_models (trained models for the face_recognition python library)
Overall these changes significantly improved speed and decreased CPU and GPU utilization compared to any of the original dlib components.
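To illustrate the background-stream approach from (A): the trick is to keep one opencv capture open, continuously grab() frames off the buffer without decoding them, and only retrieve() (decode) a frame when a caller actually asks for one. A minimal sketch of the pattern (illustrative only, not the component code):

```python
import threading
import cv2

class BackgroundCamera:
    """Keep an RTSP stream open and always serve the newest frame."""

    def __init__(self, rtsp_url):
        # A CUDA-enabled OpenCV build could offload decoding to the GPU
        # (e.g. via cv2.cudacodec); the plain VideoCapture is shown here.
        self._cap = cv2.VideoCapture(rtsp_url)
        self._lock = threading.Lock()
        self._running = True
        self._thread = threading.Thread(target=self._reader, daemon=True)
        self._thread.start()

    def _reader(self):
        # grab() pops frames from the buffer without decoding them, so
        # frames nobody asked for are discarded almost for free.
        while self._running:
            with self._lock:
                self._cap.grab()

    def get_frame(self):
        # retrieve() decodes only the most recently grabbed frame.
        with self._lock:
            ok, frame = self._cap.retrieve()
        return frame if ok else None

    def stop(self):
        self._running = False
        self._thread.join()
        self._cap.release()
```

A consumer then just calls get_frame() whenever the image processing fires, and always receives a current frame instead of whatever had piled up in the buffer.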
At the moment, CUDA inference in openCV is broken with the latest CUDA release, so I have not yet switched the GPU on for face detection (it worked fine using the dlib CNN model), but a fix may already have been posted, so I will recompile openCV shortly...
Edit: Sure enough, openCV is fixed. I am running the face detection on the GPU now.
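For reference, pointing OpenCV's DNN module at the GPU is a two-line change once the library is compiled with CUDA support (same assumed model files as the sketch above; requires an OpenCV 4.2+ build with CUDA, and ideally cuDNN):

```python
import cv2

net = cv2.dnn.readNetFromCaffe("deploy.prototxt",
                               "res10_300x300_ssd_iter_140000.caffemodel")
# These calls request the CUDA backend; a build without CUDA support
# logs a warning and falls back to the CPU.
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
```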
At the moment I'm using the Surveillance Station software on my Synology, but I'm limited to 6 cameras (2 licenses included, plus a 4-pack I bought a while ago).
But I have 8 cameras, so right now 2 of them are not in the NVR!
I tried motionEye back in the day, but that software was very slow and all my camera feeds were lagging...
Any other solution? 😉
Sharing an excellent skill I use to locally stream from my IP cams to Echo Shows:
Monocle. Your video stream does not need to go to the cloud; this skill just forwards the local stream address to the Echo device when the camera name is called. It does require them to host the address and camera information (credentials) on their server, though. I personally block all my IP cameras from accessing the internet at the router.
Video Doorbells
-
Thank you for bringing this up... The big main discussion around this doorbell on IPCamTalk is about power and the old chime. When I switched to Skybell I had moved from an electronic chime to a more traditional mechanical chime, so I never faced any problems. It appears that electronic chimes are less standardized and various models can cause problems.
In the US, doorbells are powered by low-voltage AC (similar to yard lighting) with a lot of tolerance on voltage. This doorbell takes both AC and DC, 12 to 24V, but it must work with the chime too. In my case I had to upgrade my power supply to a 30W unit. These tiny wires over long runs have a lot of power loss, so I went to 20V AC. You get more power loss with DC than with AC, and you also get more power loss at lower voltages (Joule's law).
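To put rough numbers on the voltage point (an illustration only; the 10 W load and 0.5 Ω of round-trip wire resistance are assumptions):

```python
# Joule's law: P_loss = I^2 * R. For a fixed load power, halving the
# supply voltage doubles the current and quadruples the wire loss.
load_w = 10.0      # doorbell load (assumed)
wire_ohms = 0.5    # round-trip resistance of the bell wire run (assumed)

for volts in (12, 16, 20, 24):
    amps = load_w / volts
    loss_w = amps ** 2 * wire_ohms
    print(f"{volts:>2} V supply: {amps:.2f} A, {loss_w:.3f} W lost in the wire")
```

At 12 V the wire dissipates roughly four times what it does at 24 V for the same load, which is why bumping the supply voltage helps on long runs.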
As for your first question, you can install this in your chime:
For less than $5, you get a zigbee doorbell notification. I don't use the talk-back function, only the snapshot. If I need the intercom functionality, I can enable the cloud function, which is free. I keep mine enabled, but I have redundancy... so as not to be cloud dependent.
For information, this is the thread I have been mentioning, with the 101 about this unit, which summarizes almost everything:
-
Hmmm... that way it is not really a doorbell but a snapshotter :-). I have had that for years... haha...
A Hikvision cam connected to Synology Surveillance Station and, in parallel, to a Fibaro universal sensor. Before, the cam was in the Vera (directly) and the VeraAlerts app sent me a snapshot via mail when motion was detected.
Recently I moved to Home Assistant, integrated the Vera devices there, added Node-RED and Telegram, and now I receive an INSTANT notification when the doorbell is pressed and an instant snapshot from 2 cameras, the door cam and the front yard cam. The only thing I would really like (for after the corona period) is to talk back and have an instant video and audio stream...
-
My door is covered by a hidden microwave sensor. When triggered, Reactor sends an HTTP request to my phone or TV box, depending on home or away mode. My phone/box uses Automate to receive the HTTP request; Automate then opens the camera app and displays the camera. I can talk and listen through the camera app. It normally takes 5-10 seconds to complete. If I am home, the Vera sends a TTS message to Alexa, informing me there is someone at the door.
-
I use a
https://www.ebay.co.uk/itm/AC-110-220V-Microwave-Radar-Sensor-Switch-Body-Motion-Detector-for-LED-Light/274176461640?ssPageName=STRK%3AMEBIDX%3AIT&_trksid=p2057872.m2749.l2649
connected to a Shelly 1, which is integrated with @therealdb's Virtual HTTP Switch as a sensor.
-
@sender
Be careful with the placement, as these sensors can see through doors and thin walls. You can shield the sensor with thin metal to stop false trips.
I mounted my sensor in the meter cupboard and placed a section of angle iron at the bottom to stop trips from cats etc. The white box at the bottom contains the sensor and the Shelly.
-
Video over wifi is OK IF the bandwidth is sufficient. That is usually not the case once you start adding multiple wifi cameras to your network, and you quickly learn how unreliable a wifi connection can be.
My suggestions here are always:
- Keep wifi cameras on a dedicated WLAN network.
- Calculate your bandwidth and never plan on using more than 80% of your achievable throughput, depending on your wifi specs (see the sketch after this list).
- If you are distributing the camera feed to multiple clients, ALWAYS configure your camera/DVR/clients to use multicast.
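As a back-of-the-envelope check for the bandwidth point (a sketch; both the measured throughput and the per-camera bitrates below are made-up numbers to substitute with your own):

```python
# Keep the total camera bitrate well under ~80% of the real-world
# throughput of the band the cameras share (measured, not spec-sheet).
measured_throughput_mbps = 100          # e.g. from an iperf run (assumed)
camera_bitrates_mbps = [4, 4, 2, 2, 8]  # per-camera stream bitrates (assumed)

budget = 0.8 * measured_throughput_mbps
total = sum(camera_bitrates_mbps)
status = "OK" if total <= budget else "over budget"
print(f"Cameras need {total} Mbps of an {budget:.0f} Mbps budget -> {status}")
```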
When it comes to the topic at hand, and just for informational purposes: my doorbell is from DoorBird and is connected through PoE, and so far it has been the most reliable/flexible solution I've tried yet.
-
@rafale77 said in Video Doorbells:
I also get snapshots through push notifications via Pushover on motion, like all of my other IP cams. Motion detection is switched on and off by openLuup... based on house mode.
How do you use Pushover from openLuup? I've only been able to do Growl and SMS. Also, being able to include a frame grab in the message would be great.
-
I use a Home Assistant binding for this:
I have set up Home Assistant with the Pushover component. I then wrote a scene on the Vera to send a command to Home Assistant to take the snapshot from the camera I want and embed it into the notification, as documented above. I shared the snippet for sending commands to Home Assistant in the snippet/code section.
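For illustration, the general shape of such a command is a pair of REST calls to Home Assistant (a sketch only, not the author's actual snippet; the URL, token, entity ID and file path are placeholders, and the attachment field assumes the Pushover integration's attachment support):

```python
import requests

HASS = "http://homeassistant.local:8123"  # placeholder address
HEADERS = {"Authorization": "Bearer YOUR_LONG_LIVED_TOKEN",
           "Content-Type": "application/json"}

# Ask the camera integration to write a snapshot to disk...
requests.post(f"{HASS}/api/services/camera/snapshot", headers=HEADERS,
              json={"entity_id": "camera.doorbell",
                    "filename": "/config/www/doorbell.jpg"})

# ...then push it out through the Pushover notify service.
requests.post(f"{HASS}/api/services/notify/pushover", headers=HEADERS,
              json={"message": "Motion at the front door",
                    "data": {"attachment": "/config/www/doorbell.jpg"}})
```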
-
@Elcid said in Video Doorbells:
My door is covered by a hidden microwave sensor. When triggered, Reactor sends an HTTP request to my phone or TV box, depending on home or away mode. My phone/box uses Automate to receive the HTTP request; Automate then opens the camera app and displays the camera. I can talk and listen through the camera app. It normally takes 5-10 seconds to complete. If I am home, the Vera sends a TTS message to Alexa, informing me there is someone at the door.
I have a BTicino doorbell (part of Legrand, very common in France/Italy/Spain). They have a proprietary system, but it's basically a video doorbell. I spent almost 1k to have it installed, but it handles multiple entries, multiple inside points, etc. I have it hooked up with a binary sensor to get the doorbell status, plus a switch to open the gate. It natively supports Android/iOS, so I get a call and can answer even when I'm away. I use it to let packages onto my porch, etc., since here in Italy an open front yard is not common; the whole property is closed to outside guests.
I too have a setup similar to @Elcid's. When someone rings, I send a notification to our phones, I turn on camera recording (I turn it OFF during the day unless events are triggered), and I turn on a tablet in the open space showing all the external cameras. It's near the video doorbell's proprietary tablet, so you see the person at the doorbell plus all the cameras. Everything is pushed via my Telegram bot.
If it's night, I turn on all the lights (I have more lights outside that I turn on during parties or when doors/gates are opened). Overall I like my setup; the only thing I regret is that the system is closed and I can't easily capture a screenshot of the wide-angle camera. I wanted to keep this separate from the HA system for stability, and from that point of view I was right.
-
@therealdb said in Video Doorbells:
Overall I like my setup; the only thing I regret is that the system is closed and I can't easily capture a screenshot of the wide-angle camera.
My NVR is so old that its SSL encryption is obsolete, so the email function no longer works. I capture images with Automate and use Automate to send an email with the screenshot to my device or an FTP server. If your system has an app, it may be possible to capture the screen with Automate.