Premise Z-Wave Status using RZC0P/VRC0P help

Motorola Premise
123, your code is very neat! I'm impressed with the object morphing and the group implementation. The morphing works great!

The way you are designing the zwave classes is very smart too. The module has mostly been designed generically and one could easily use it with a different controller or lighting technology.
 
... Polling works better if you poll one device at a time. ...setting the timing at 100ms ...but ~1/7 polls still receive an X002 from one random device. This really only happens on polling.

... use command acknowledgement and send the next command only after receiving X000. ...
etc6849,

I've been studying your driver and I have a few questions:

The documentation states that "<E000" and "<X000" are produced after a command is successfully executed but that "<X000" may come before or after the response is received (tricky!). Transmitting another command before receiving an "<E000" response could result in overflowing the RZC0P's buffer. Transmitting another command before receiving "<X000" ... not sure what happens in this case ... the Z-Wave network may be busy processing the first command when the second one is issued and simply ignore it.

I believe one needs to wait for an "<E000" response (minimally) before transmitting additional commands. This ensures the interface received and processed the command. Waiting for an "<X000" would ensure the previous command has been successfully processed by the network and that it is ready to process another command. Beta5's polling method (PollDevices) iterates through all Devices and uses a 100ms delay between transmissions. That means every 100ms a ">?N" is transmitted for each Device. I'm not certain but this might flood the network and that's why you're seeing a less-than-perfect success rate (1/7 failures). This is just a guess but transmitting commands in rapid succession in a larger Z-Wave network might exhibit an even higher failure rate.
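To make the acknowledgement rule concrete, here is a minimal sketch in Python (not Premise VBScript, so it can be run outside Builder). The function names are hypothetical, and the meanings attached to each code are only the ones discussed in this thread: "<E000" means the interface accepted the command, "<X000" means the network finished it, and "<X002" means the transmission failed.

```python
def classify_response(line):
    """Classify one line received from the interface (meanings as discussed in this thread)."""
    line = line.strip()
    if line == "<E000":
        return "received"    # interface received and processed the command
    if line == "<X000":
        return "completed"   # network finished the previous command
    if line == "<X002":
        return "failed"      # transmission of the requested command failed
    return "other"           # status reports and any code not covered here


def ready_for_next(lines, wait_for_x000=False):
    """Decide whether the next command may be sent.

    lines: everything received since the last command was transmitted.
    The order is not checked, because "<X000" may arrive before or after
    the other responses.
    """
    kinds = {classify_response(line) for line in lines}
    if "received" not in kinds:
        return False
    if not wait_for_x000:
        return True
    # A failed transmission also ends the wait; retrying is a separate decision.
    return "completed" in kinds or "failed" in kinds
```

With `wait_for_x000=False` the driver only throttles on "<E000"; setting it to `True` models the stricter "wait for both" option.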

To improve the success rate, I believe flow-control is needed in order to throttle the transmission of commands. In other words, before sending a second command, the driver needs to wait for an "<E000" confirmation ... maybe even an "<X000" as well. Naturally, while waiting for a response to the first command the driver may get a request to send another one. Clearly the driver cannot ignore the second request; it must store it in a queue and process it at the earliest opportunity. I'm currently experimenting with this idea ... the queue could simply be an array or a container that holds child "command" objects. In addition, a watchdog timer is needed to ensure that the driver doesn't wait forever for the "<E000" confirmation.

In OnChangeOnNewData, there's this line:
Code:
	'set flag to prevent feedback loop
	this.Updating = true
So when new data is received, the Updating property is enabled. Then in SendCommand it has the following:
Code:
'if we aren't updating based on feedback, send the command
if this.Updating = false then
So when OnChangeOnNewData is processing incoming data, it prevents SendCommand from transmitting data. This sounds reasonable but I don't understand the comment's point about preventing a feedback loop. Can you shed some light on it?

I'm of the opinion that if an "<X002" is received (transmission of the requested command has failed) ... well, for now, tough luck. Perhaps a future version of the driver will retry transmitting the failed command.
 
Thanks 123 for all your expertise and help! An in/out buffer array sounds like a great and clever idea. I too believe waiting for <E000 before sending another command will get rid of the X002 issue altogether.

Feedback Prevention Flag:
The only example I can think of is that the flag will help stabilize a value for things like receiver volume (where repetitive values are sent, in contrast to brightness) if two-way control is used. This was touched on in one of Damon's videos where he made an Onkyo receiver driver, and that's where I stole it from.

It turns out that if someone were to turn the volume up under the automation browser, many volume values will be transmitted to the Onkyo receiver and the receiver will report the status for each level as the volume is increased. This would mean values have the potential to be unstable as the received values may be lower than those just transmitted due to timing characteristics and the onchangenewdata script may fight back and lower the volume some when you are actually trying to raise it!

I'm not sure one needs the flag to prevent feedback since brightness values are not continuously sent as you hold down the brightness up button in the automation browser (i.e. good catch 123, I don't think the feedback flag is needed :) ). However, I still don't have a thermostat so it could be an issue with some thermostats. I'm also not yet sure how the thermostat buttons will behave under the automation browser.
 
I've built a "JobQueue" class in Premise that is effectively a First-In, First-Out (FIFO) buffer. If one wishes to implement flow-control in a driver (i.e. send a command and wait for acknowledgement before sending the next pending command) a JobQueue can store, and process, the outbound commands.

A JobQueue is a collection of Jobs where each Job holds a command string. JobQueue has just four methods:
  • AddJob -> Add a new job to the queue.
  • RunJob -> Add a new job to the queue and run it immediately.
  • ProcessCurrentJob -> Send the command string contained in the current job.
  • GetNextJob -> Set the next job in the queue as the current job and process it.
and two properties:
  • JobIndex -> An incremented number that is assigned to a Job's name.
  • CurrentJob -> Identifies the current job in the queue.
Using it is simplicity itself. You don't need to bother with any of JobQueue's methods or properties. Just create a method in your driver class called SendCommand that takes one parameter ("Data": the command string to transmit) and add the following code:
Code:
if this.Jobs.CurrentJob <> "" then
	this.Jobs.AddJob method.Data
else
	this.Jobs.RunJob method.Data
end if
Here's how JobQueue works:

SendCommand "MyCommandString"
If there are existing Jobs in the JobQueue, SendCommand uses AddJob to append a new Job to the queue.
If there are no existing Jobs, SendCommand uses RunJob.

AddJob
AddJob simply adds a new Job to the JobQueue. The Job contains the command to be transmitted (i.e. "MyCommandString"). The Job is given a unique name.

RunJob
In contrast, RunJob adds a new Job to the queue (via AddJob) and then immediately processes it using ProcessCurrentJob.

ProcessCurrentJob
ProcessCurrentJob gets the command string from the current job and transmits it. It then activates a Watchdog timer. If an acknowledgement is not received within a given time period, the Watchdog will simply delete the current job and move on to processing the next job in the queue. If the driver's OnChangeOnNewData method receives an acknowledgement promptly, it calls GetNextJob.

GetNextJob
GetNextJob deletes the Watchdog timer and the current job. If it finds another job in the queue, it sets it as the current job and calls ProcessCurrentJob. If there are no other jobs in the queue, it sets JobIndex to zero and CurrentJob to null.
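For anyone who wants to follow the logic without installing the XDO, here is a rough Python model of the four methods described above. It is only a sketch under stated assumptions, not the Premise code itself: `transmit` is a hypothetical callable that sends a string, the Watchdog is reduced to a boolean flag (whoever owns the timer calls `get_next_job` on expiry), and `send_command` lives on the queue here although in the driver it is a method of the driver class.

```python
class JobQueue:
    """FIFO of named jobs; each job holds one command string."""

    def __init__(self, transmit):
        self.transmit = transmit      # hypothetical callable that sends a string
        self.jobs = {}                # job name -> command (dicts keep insertion order)
        self.job_index = 0            # number used to name the next job
        self.current_job = ""         # name of the job awaiting acknowledgement
        self.watchdog_armed = False   # stand-in for the Premise Watchdog timer

    def add_job(self, command):
        """Append a new, uniquely named job to the queue."""
        name = "Job%d" % self.job_index
        self.job_index += 1
        self.jobs[name] = command
        return name

    def run_job(self, command):
        """Add a job and process it immediately (used when the queue is idle)."""
        self.current_job = self.add_job(command)
        self.process_current_job()

    def process_current_job(self):
        """Transmit the current job's command and arm the watchdog."""
        self.transmit(self.jobs[self.current_job])
        self.watchdog_armed = True

    def get_next_job(self):
        """Call on acknowledgement, or when the watchdog expires."""
        self.watchdog_armed = False
        self.jobs.pop(self.current_job, None)
        if self.jobs:
            self.current_job = next(iter(self.jobs))   # oldest remaining job
            self.process_current_job()
        else:
            self.job_index = 0
            self.current_job = ""

    def send_command(self, command):
        """Mirror of the driver's SendCommand: queue if busy, otherwise run now."""
        if self.current_job != "":
            self.add_job(command)
        else:
            self.run_job(command)
```

Calling `send_command` three times in a row transmits only the first command; the other two wait until `get_next_job` is called for the job ahead of them.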

Here's how to test the JobQueue.

  1. Run DebugView so you can see the debugging messages.
  2. Delete Modules > zWave.
  3. Install the attached XDO.
  4. Make a new zWave object in Devices > CustomDevices
  5. You'll notice it contains a "JobQueue" object. In a release version, "JobQueue" would be hidden.
  6. In the zWave object, click the "transmit" property at least three times. This simulates the generation of three successive outbound commands.
  7. Expand the JobQueue and you'll notice three Jobs: "Job0", "Job1", "Job2", etc.
  8. Click the "receive" property once. This simulates the acknowledgement of the transmitted command.
  9. "Job0" will disappear and now "Job1" is processed.
  10. Don't do anything for at least ten seconds.
  11. The Watchdog timer will expire after ten seconds (release version would use a shorter time period) and automatically purge "Job1" and move on to process "Job2".
  12. Click "receive" once.
  13. "Job2" is effectively acknowledged and will disappear from the JobQueue.
  14. The JobQueue is now empty.
  15. Click "transmit" and a new "Job0" is created.
This is about as clean a design as I can invent. Aside from the Watchdog timer there are no other timers, or timing loops, involved. The jobs are processed as fast as the acknowledgements are received and the Watchdog timer simply ensures that processing never stalls.
 

Attachments

  • zwave3.zip
  • zwave3.png
Wow 123! This is really a nice implementation.

To test your implementation in the real world, I've added the job classes to the beta v5 driver. I set the command time to 100ms and also added some test variables called JobCount and ErrorCount (which counts X002 occurrences) to the VIZIA driver. I also sent no other control signals over the zwave network during the test (i.e. I didn't use the handheld Vizia controller). The number of devices on my network is 4 (not including the rs232 controller and the handheld remote).

With the poll interval set to 10 seconds: 1 transmission error X002/ 1002 jobs processed (~250 polls to 4 devices).

With the poll interval set to 3 seconds: 1 transmission error X002/ 600 jobs processed (150 polls to 4 devices).

Obviously a 3 second poll interval is not what you would use in the real world. Perhaps a 5 minute poll interval is a good number, which would mean transmission errors would be relatively rare (if they occur at all). This new version appears to be very reliable and I think it works great! 1/1000 errors isn't bad I think for a lighting system!
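For comparison, the figures quoted in this thread can be turned into percentages with a few lines of Python. Nothing here is new data: the inputs are the roughly 1-in-7 failure rate reported before queueing and the two test runs above.

```python
def error_rate(errors, jobs):
    """Fraction of jobs that ended in an X002 transmission error."""
    return errors / jobs


before = error_rate(1, 7)          # "~1/7 polls" without command queueing
after_10s = error_rate(1, 1002)    # 10 second poll interval
after_3s = error_rate(1, 600)      # 3 second poll interval

for label, rate in [("no queue", before),
                    ("queue, 10 s polls", after_10s),
                    ("queue, 3 s polls", after_3s)]:
    print("%-18s %.2f%%" % (label, rate * 100))
```

Both queued runs come out well under 0.2%, against roughly 14% without the queue.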

What do you think about automatic reset?
I added a reset port script that will remove the current serial port from Network, then re-add it. The reset property is toggled every time there is no response from the rs232 controller (ie no Exxx or bad packet).
 

Attachments

  • leviton_beta_v7.zip
That's what I like to see! That's a substantial improvement over a 1 in 7 error rate!

I'll have a look at the reset code; it sounds like a good idea.

Eventually, we need to add support for the CommunicationFailure property. This property is found in all native drivers (I included it in the WeederTech driver) and indicates when the driver has lost communications with the physical device (i.e. the RZC0P). A standard thing to do when this happens is to have the driver log an Event.

Here's my idea for Groups. Each Group will have three properties:
  1. GroupNumber (Integer)
  2. ProgramGroupNumber (Boolean, Momentary)
  3. Programmed (DateTime)
So if you set the GroupNumber for "AllLights" to 125 and then click "ProgramGroupNumber", the driver will time-stamp "Programmed" and send a Group Store command ("N1,2,3,4GS125") to the RZC0P. This is something the end-user would (optionally) do for each group.

Now when you set a Group's PowerState to ON, the driver will check to see if the Group has an assigned GroupNumber (i.e. a non-zero value). If so, the driver will issue a Group Recall command ("GR125ON") to turn on lights in the group. On the other hand, if GroupNumber is zero, the driver iterates through all members of the group and sends individual commands ("N1ON,UP"). Naturally, sending one GR command is more efficient than sending one N command per device.

The "Programmed" property is purely a convenience to remind the end-user when the GroupNumber was programmed.
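The command-selection rule for Groups can be sketched as follows. This is Python rather than Premise VBScript, the function names are hypothetical, and the command strings are copied from the examples in this post ("N1,2,3,4GS125", "GR125ON", "N1ON,UP"); the leading ">" used when transmitting is left to whatever sends the string.

```python
def group_store_command(nodes, group_number):
    """Build the Group Store command, e.g. "N1,2,3,4GS125"."""
    node_list = ",".join(str(n) for n in nodes)
    return "N%sGS%d" % (node_list, group_number)


def group_on_commands(nodes, group_number=0):
    """Commands needed to turn a group on.

    A non-zero group_number means the group was programmed into the
    interface, so a single Group Recall is enough. Otherwise one command
    per member is issued.
    """
    if group_number:
        return ["GR%dON" % group_number]
    return ["N%dON,UP" % n for n in nodes]
```

For example, `group_on_commands([1, 2, 3, 4], 125)` yields one command, while `group_on_commands([1, 2, 3, 4])` yields four.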

QUESTION
Do you know if sending this command ">N1,2,3,4,5,6ON,UP" is as reliable as sending individual commands like ">N1ON,UP", ">N2ON,UP", etc? Is it more, or less, time-efficient?

I'm trying to determine how a Group should be handled if the GroupNumber is not assigned. Should I send one long list (">N1,2,3,4,5,6ON,UP") or a bunch of individual commands (">N1ON,UP")?
 
I too think 1/1000 transmission failures is great. Especially considering my apartment is multiple levels and the signals may go through multiple walls/floors to reach the dimmer. I'm running a real world polling test this week with 2 minute polls combined with normal usage of the zwave network to test the driver. I'll report my results on Friday.

CommunicationFailure
Great idea 123! For the CommunicationFailure property, I would implement it by doing the following:
I already have a property of VIZIA called resetPortCount that I was going to delete later after testing, but why not keep it around? All one would do is add a timer that resets resetPortCount to 0 every half hour; then write an onchange script for resetPortCount that triggers a CommunicationFailure if there have been more than 5 port resets in the last half hour. I think this is a reasonable test as we would want to try resetting the port a few times before saying there is a true communications failure. I'm not sure about the code to log the event though... PS: Let me know if you need me to make this change or if you are already doing ;)
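Here is a small Python model of that rule, just to pin down the thresholds. It is a sketch with one simplification that is my own assumption: instead of a separate half-hour timer, the counter is cleared when a reset arrives more than half an hour after the window started. The class and attribute names are hypothetical apart from resetPortCount and CommunicationFailure.

```python
class ResetMonitor:
    """Flags a communication failure after too many port resets in one window."""

    def __init__(self, limit=5, window_seconds=1800):
        self.limit = limit                    # "more than 5 port resets"
        self.window_seconds = window_seconds  # "in the last half hour"
        self.window_start = None
        self.reset_port_count = 0
        self.communication_failure = False

    def record_reset(self, now):
        """Record one port reset; now is a time in seconds from any steady clock."""
        expired = (self.window_start is None
                   or now - self.window_start >= self.window_seconds)
        if expired:
            # Start a fresh window, like the timer that zeroes resetPortCount.
            self.window_start = now
            self.reset_port_count = 0
        self.reset_port_count += 1
        if self.reset_port_count > self.limit:
            self.communication_failure = True
        return self.communication_failure
```

Five resets inside the window are tolerated; the sixth sets the failure flag.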

Group Question/Thoughts
I'll have to check on all this stuff as I haven't played with groups. However, I believe it is better to send commands to one device at a time now that we are properly testing for E000 before sending the next command. This is what the forum on zwaveworld seemed to say too. In fact, I believe I received X002 a few times last week when I tried to do something like ">N1,2,3,4,5,6ON,UP" all at once, but that was when I was using a 10ms command delay time (which is way too short).

I like being able to make a folder and not having to worry about programming/saving group names (but still having this option). This is a clever idea 123! Your implementation for group properties makes perfect sense to me. I'm going to study group commands tonight, but does >?GR return all groups and their nodes? Perhaps it might be worthwhile then to write a script to automatically detect existing nodes? This is obviously an optional, not required, add-on.

Future Improvements
These items sound like something to add in the future after there is a more finalized driver but...

Motion Sensor support?

Button Controller Support?
It appears that the rs232 controller can read button associations? Perhaps it's worthwhile to add a new type of device such as a generic four button controller where a user can add their own code to each button under a home keypad, or define associated nodes for each button and hit program. I don't have one of these $100+ controllers to test though :( I do have a handheld programmer and it may act like a four button controller. I can play with it tonight and see what all I can do with this type of functionality.
 
... I'm running a real world polling test this week with 2 minute polls combined with normal usage of the zwave network to test the driver. ...
I look forward to seeing your results. Polling is an important feature that will make this driver more useful for non-ViziaRF Z-Wave devices.

... triggers a CommunicationFailure if there have been more than 5 port resets in the last half hour .... I'm not sure about the code to log the event though...
In other drivers, I've created a "heartbeat function" that periodically queries the hardware device and looks for a specific response. If the response is not received after three queries, the assumption is that the connection to the hardware has failed and "CommunicationFailure" is enabled. The trick is to find the right ViziaRF command to use.

Here's some OnChangeCommunicationFailure code from one of my drivers. "LogFailures" is a Boolean that lets the end-user choose if failures should be logged as Premise Events.
Code:
if sysevent.newVal and this.LogFailures then
	' Create an entry in the Premise Event Log
	dim oEvent
	set oEvent = Events.CreateObject(Schema.System.Event.Path, "Communications Failure")
	with oEvent
		.Description = "No response from Premise TTS Driver Service: " & this.Name
		.Severity = 50
		.EventTime = Now
		.LinkObject = this
	end with
	set oEvent = nothing
end if

... I believe it is better to send commands to one device at a time ... I received X002 a few times ... when I tried ... ">N1,2,3,4,5,6ON,UP" all at once, but that was when I was using a 10ms command delay time (which is way too short).
Can you confirm if the current driver version (with command queueing and a 100ms delay) handles ">N1,2,3,4,5,6ON,UP" better than the previous one? Also, do lights turn on simultaneously with ">N1,2,3,4,5,6ON,UP" or do they turn on one-at-a-time like they would if called individually with ">N1ON,UP"?

... does >?GR return all groups and their nodes? ... it might be worth while then to write a script to automatically detect existing nodes?
I've been looking through the XML documentation for Johnny 9's .NET ViziaRFLibrary and it has a command that "Gets or sets a list of all Node IDs in the group." This suggests that there is a ViziaRF command that returns a list of nodes contained in a group. However, I don't know about a single command that gets a list of all groups and nodes.

If one is able to determine the nodes in a group (from the RZC0P), this information can be used to update the driver's Groups. I guess this "housekeeping function" would be executed by the driver on a periodic basis. Definitely something worth investigating for a future version.


BTW, I'm exploring what is needed to enhance the JobQueue so that a Job can be re-issued in the event of failure. The Job would be purged only if it fails after being issued three times or by the Watchdog. This is in contrast to the current scenario where the Job is purged after the first failure (or by the Watchdog).
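One possible shape for that retry rule, sketched in Python under my own assumptions (the names `Job`, `issue` and `on_failure` are hypothetical, and `transmit` stands for whatever sends the string): each Job counts how many times it has been issued, and a failure re-issues it until the limit of three is reached.

```python
MAX_ATTEMPTS = 3   # "purged only if it fails after being issued three times"


class Job:
    def __init__(self, command):
        self.command = command
        self.attempts = 0      # how many times the command has been issued


def issue(job, transmit):
    """Transmit the job's command and count the attempt."""
    job.attempts += 1
    transmit(job.command)


def on_failure(job, transmit):
    """Handle a reported failure for the job.

    Returns True when the job should be purged, False when it was re-issued.
    """
    if job.attempts < MAX_ATTEMPTS:
        issue(job, transmit)
        return False
    return True
```

A Watchdog expiry would bypass `on_failure` and purge the job directly, as it does today.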
 
I've revisited the earlier topic of the feedback prevention flag.

if we change brightness of the light from SYS with no feedback prevention:
>N002L60,UP
>N002L60,UP
<E000
<X000
<N002S000,060,000
>N002L60,UP
>N002L60,UP
<E000
<X000
<N002S000,060,000

with feedback prevention:
>N002L38,UP
>N002L38,UP
<E000
<X000
<N002S000,038,000

if we change brightness of the light manually with no feedback prevention:
<N002:130,001
<N002S000,049,000
>N002L49,UP
>N002L49,UP
<E000
<X000
<N002S000,049,000

with feedback prevention:
<N002:130,001
<N002S000,050,000

 
Nodes 2 and 14 appear to be actuating at the same exact time like you thought they would :) However, the nodes don't send the update reply at the same time and there can be a small delay between updating 2 (<N002S000,099,000) and updating 14 (<N014S000,255,000), but this isn't a big deal and is normal.

>N2,14,OFF,UP
>N2,14,OFF,UP
<E000
<X000
<N002S000,000,000
<N014S000,000,000
>N2,14,ON,UP
>N2,14,ON,UP
<E000
<X000
<N002S000,099,000
<N014S000,255,000

I also tried >N2,26,17,14ON,UP. I then hit the spacebar very fast to send it multiple times and received no errors.

>N2,26,17,14ON,UP
>N2,26,17,14ON,UP
<E000
<X000
<N002S000,099,000
>N2,26,17,14ON,UP
>N2,26,17,14ON,UP
<E000
<X000
>N2,26,17,14ON,UP
>N2,26,17,14ON,UP
<E000
<X000
>N2,26,17,14ON,UP
>N2,26,17,14ON,UP
<E000
<X000
>N2,26,17,14ON,UP
>N2,26,17,14ON,UP
<E000
<X000
<N014S000,255,000
<N017S000,080,000
<N026S000,099,000
<N002S000,099,000

PS: I've included the new code. It appears I forgot to divide by 100 in one spot :( and I've added the flag I talked about in my previous post.
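For reference, the status lines in the traces above can be picked apart with a short Python sketch. The field meaning is only inferred from these traces (a command of L60 is followed by "<N002S000,060,000", so the middle number is taken to be the level); the first and last fields are read but not interpreted, and the function name is hypothetical.

```python
import re

# "<N" + 3-digit node + "S" + three comma-separated 3-digit fields
STATUS_PATTERN = re.compile(r"^<N(\d{3})S(\d{3}),(\d{3}),(\d{3})$")


def parse_status(line):
    """Return (node, level) for a status line, or None for anything else."""
    match = STATUS_PATTERN.match(line.strip())
    if match is None:
        return None
    node, _first, level, _last = (int(group) for group in match.groups())
    return node, level
```

Lines such as "<E000" or "<N002:130,001" do not match the pattern and come back as `None`, so they can be handled elsewhere.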
 

Attachments

  • leviton_beta_v8.zip
... with no feedback prevention:
>N002L60,UP
>N002L60,UP
<E000
<X000
<N002S000,060,000
>N002L60,UP
>N002L60,UP
<E000
<X000
<N002S000,060,000

with feedback prevention:
>N002L38,UP
>N002L38,UP
<E000
<X000
<N002S000,038,000
I think I understand what you mean now by 'feedback prevention'. I encountered this issue when developing the ELK M1 driver which supports Dimmers and Appliances. I resolved it using a very different technique that does not involve "blocking" the operation of OnChangeOnNewData.

Here's what a Dimmer's OnChangeBrightness looks like in the M1 driver:
Code:
if not (sysevent.srcElement.Name=this.Parent.Parent.Name and sysevent.srcProperty.Name="OnNewData") then
	iLevel = round(this.Brightness * 100)
	this.PowerState = cbool(iLevel)	
	
	' Convert Premise range 1-100 to M1 range 2-99
	if iLevel = 100 then iLevel = 99
	if iLevel = 1 then iLevel = 2
	
	this.Parent.Parent.PLCDeviceControl this.OID, 9, iLevel, 0
end if
The first line is what does the trick. It ignores requests issued from a specific source. In this case, if the source is the M1_Panel driver object (this.Parent.Parent.Name translates to "M1_Panel") and the calling property is "OnNewData" then the request is ignored. For the ViziaRF driver, the first line would look like this:
Code:
if not (sysevent.srcElement.Name=this.Parent.Name and sysevent.srcProperty.Name="OnChangeOnNewData") then
....
end if
You can demonstrate this to yourself as follows:
  1. Disable the existing feedback prevention scheme.
  2. Add debugout statements to OnChangeBrightness and OnChangePowerState that show sysevent.srcElement.Name and sysevent.srcProperty.Name.
  3. Change brightness and powerstate (from SYS and from the physical devices) and watch Debugview to see the names of the sources.
You should see that the duplicated responses are generated by different sources.
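The idea behind that first line, stripped of Premise specifics, is a filter on where a change came from. The Python sketch below is only an illustration: `ChangeEvent` is a hypothetical stand-in for sysevent, and the default property name "OnNewData" follows the M1 example.

```python
from dataclasses import dataclass


@dataclass
class ChangeEvent:
    """Hypothetical stand-in for sysevent: who changed the property."""
    src_element: str    # name of the object that caused the change
    src_property: str   # name of the property/script that caused it


def should_transmit(event, driver_name, data_property="OnNewData"):
    """Send a command only if the change did not originate from incoming data."""
    from_driver = (event.src_element == driver_name
                   and event.src_property == data_property)
    return not from_driver
```

A change made by the driver while it processes incoming data is ignored, while a change made by a user or a script is transmitted as usual.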

QUESTION: why are there two successive commands being transmitted?
>N002L38,UP
>N002L38,UP
 
SYS has always repeated RS232 commands on my system (for all serial drivers). Is it abnormal or is there a way to turn it off?

QUESTION: why are there two successive commands being transmitted?
>N002L38,UP
>N002L38,UP
 
I'll have to check this on my system because I've never noticed duplicated transmissions. You're seeing this via Port Spy?

The attached file is a slightly modified version of your Beta 8 driver. It has a new SetLightLevel method that is called by OnChangeOnNewData for <N S and <N L commands. It does not change the existing feedback prevention scheme. It is simply a streamlined version of the code to set a Device's Brightness and Powerstate. Let me know if it works as I have no way of testing it (been looking for an RZC0P on eBay ...)
 

Attachments

  • Leviton_Beta_8_1.zip
Yes, the double command transmissions are seen under port spy. I'll test your new driver tonight and post the results.

FYI: you might also try a search for VRC0P on ebay too as I believe the protocols are identical. I bought my RZC0P on ebay for $54.99 as a buy it now; it was stated that it was old new stock. I did see that zwave products has it for $69.99, but I don't recall how much shipping is and haven't dealt with them before. http://www.zwaveproducts.com/ZWAVE-PC-CONT...-Interface.html
 
Yes, the double command transmissions are seen under port spy.
I set up a test-bed in order to observe the commands transmitted by the ViziaRF driver. The test-bed consists of a null-modem cable connecting two serial-ports (COM5 and COM6). Hyperterminal is connected to COM5 and the ViziaRF driver is on COM6.

Using Port Spy, I can see what's transmitted on COM6 and Hyperterminal lets me see what is received on COM5. I discovered that the ViziaRF Beta 7 driver sends only one instance of each command. If I enable a device's PowerState, the driver sends a single ">N001ON,UP" command.

Is it possible that what you believe are two instances is due to the way Port Spy displays data? Have a look at the attached image; there are two identical lines of displayed data but only one instance of the command is actually sent. Port Spy uses colours to convey the meaning of the data. The first line (Brown) represents the data waiting to be sent. The second line (Green) represents the data that was sent.

------------------------------------------------------------------------

While testing the driver, I encountered something odd. I bound a light to the driver (a Dimmer), enabled its PowerState and waited. I don't have an RZC0P so there's nothing to respond to the transmitted command. Ten seconds later, the Watchdog timer expired, and purged the job. Everything worked correctly but then imagine my surprise as I watched, in Builder's Connections window, an animation of the COM5 object disappearing and then reappearing! ???

I discovered the following line in ProcessCurrentJob:
system.addTimer 10, "this.GetNextJob" & vbCrLf & "this.Parent.Reset = True", 1, "Job_Watchdog"

Why is the serial port being reset each time the Watchdog runs to completion? The Watchdog runs when the transmitted command fails to receive a reply (within X seconds). The failure could be due to causes that are unrelated to a faulty serial-connection. Perhaps the port can be reset only if the driver experiences 5 consecutive response-failures (i.e. not 5 cumulative but 5 in a row). Frankly, if you encounter 5 consecutive response-failures you might as well set the CommunicationFailure property and halt all processing.
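To show the difference between "5 cumulative" and "5 in a row", here is a minimal Python sketch of a consecutive-failure counter. The class and method names are hypothetical; the threshold of 5 is the one suggested above.

```python
class FailureTracker:
    """Counts response failures in a row; any reply resets the count."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.consecutive = 0
        self.communication_failure = False

    def on_response(self):
        """A reply arrived in time, so the streak is broken."""
        self.consecutive = 0

    def on_timeout(self):
        """The Watchdog expired without a reply."""
        self.consecutive += 1
        if self.consecutive >= self.threshold:
            self.communication_failure = True
        return self.communication_failure
```

Eight timeouts interrupted by one good reply do not trip the flag; five timeouts with nothing in between do.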
 

Attachments

  • Port_Spy.png
  • Connections.png