Blacklisted Script #190

Open
1 of 2 tasks
mkeevy opened this issue Mar 6, 2019 · 8 comments

@mkeevy

mkeevy commented Mar 6, 2019

This is a...

  • Feature Request
  • Bug Report

Problem:
I have a script rule node that occasionally fails with the following debug information:
javax.script.ScriptException: java.lang.RuntimeException: java.lang.IllegalStateException: Executor thread not set after 100 ms
at delight.nashornsandbox.internal.NashornSandboxImpl$1.invokeFunction(NashornSandboxImpl.java:353)
at org.thingsboard.server.service.script.AbstractNashornJsInvokeService.doInvokeFunction(AbstractNashornJsInvokeService.java:91)
at org.thingsboard.server.service.script.AbstractJsInvokeService.invokeFunction(AbstractJsInvokeService.java:51)
at org.thingsboard.server.service.script.RuleNodeJsScriptEngine.executeScript(RuleNodeJsScriptEngine.java:172)
at org.thingsboard.server.service.script.RuleNodeJsScriptEngine.executeUpdate(RuleNodeJsScriptEngine.java:104)
at org.thingsboard.rule.engine.transform.TbTransformMsgNode.lambda$transform$0(TbTransformMsgNode.java:52)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.lang.IllegalStateException: Executor thread not set after 100 ms
at delight.nashornsandbox.internal.ThreadMonitor.run(ThreadMonitor.java:130)
at delight.nashornsandbox.internal.JsEvaluator.runMonitor(JsEvaluator.java:47)
at delight.nashornsandbox.internal.NashornSandboxImpl.executeSandboxedOperation(NashornSandboxImpl.java:161)
at delight.nashornsandbox.internal.NashornSandboxImpl.access$000(NashornSandboxImpl.java:36)
at delight.nashornsandbox.internal.NashornSandboxImpl$1.invokeFunction(NashornSandboxImpl.java:349)
... 11 more
Caused by: java.lang.IllegalStateException: Executor thread not set after 100 ms
at delight.nashornsandbox.internal.ThreadMonitor.run(ThreadMonitor.java:91)
... 15 more

Later, the script fails completely with this error:
javax.script.ScriptException: Script is blacklisted due to maximum error count 3!
at org.thingsboard.server.service.script.RuleNodeJsScriptEngine.executeScript(RuleNodeJsScriptEngine.java:178)
at org.thingsboard.server.service.script.RuleNodeJsScriptEngine.executeUpdate(RuleNodeJsScriptEngine.java:104)
at org.thingsboard.rule.engine.transform.TbTransformMsgNode.lambda$transform$0(TbTransformMsgNode.java:52)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

The only way I have found to get the script running again is to restart the ThingsBoard server.

I have looked through the server logs, but I couldn't find anything relevant.

Proposed Solution:

Page to Update:
http://thingsboard.io/...

ThingsBoard Version: V2.2.0 running on Ubuntu 18.04.2.

@jmhernandez-circutor

In my TB instance it occurs rarely, but when it happens there is no feedback. We have no way of knowing that a script has been blacklisted until something stops working.

Blacklisting has occurred in integrations and at several points in rule chains. We cannot route error messages from every scripted rule node (we have a lot of them), and there is no way to do that in integrations.

Is there a way to know that a script has been blacklisted at the moment it happens?

@mkeevy

mkeevy commented Mar 6, 2019

@jmhernandez-circutor for me it happens less than 1% of the time. I would be happy to not blacklist the particular rule as it's not super critical. And having to restart the server seems a little harsh.

But it would be good to know when a script is blacklisted. Maybe someone knows how.

@gglemke

gglemke commented May 7, 2019

I have alarms set that trigger when a device is inactive for a specific period. When I receive such an alarm I investigate why. There could be a number of reasons, but sometimes blacklisting is the culprit. The main point is that we know about it fairly quickly. I agree it would be nice if TB could email the tenant admin when a particular script has been blacklisted.

@efeeyuboglu

I have the same problem. Something stops working even though I tested every single node. The bad thing is that you never know which node got blacklisted, because they all seem fine. It would be very nice if there were a way to set alarms or send emails when blacklisting occurs.

@Bliph

Bliph commented Nov 5, 2019

Same issue here. I added logic to send an email when scripts fail. The error occurs for both the “Filter - script” and “Transformation - script” node types. It started occurring after we deployed the alarm system on our installation. Maybe this caused a lot of extra load, and it also creates errors in the log around the same time the scripts fail.

Logic to send email on script failure: [script node] -> (Failure link) -> [Action - create alarm] -> (Created link) -> [Transformation - to email] -> (Success link) -> [External - send email]

Please see issue thingsboard/thingsboard#1274 for details on why scripts may fail.

There is also a good summary in thingsboard/thingsboard#2085

Do any of you have answers to the following questions?

  • How can I find the "root" error causing the script to fail? The error seems to be "lost" after 3 retries, and I cannot find any Java error in the log.
  • Is there a way to extract the error text inside the rule node and add it to metadata (or something like that)?
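
On the second question, one possible approach is to catch the error inside the script itself and copy its text into metadata. The following is a minimal sketch, not a confirmed ThingsBoard recipe. It assumes the usual transformation script contract, where `msg`, `metadata` and `msgType` are in scope and the script returns an object with those three fields. The names `transform`, `safeTransform` and the `scriptError` metadata key are hypothetical. It can only catch exceptions thrown by the script's own code. Sandbox errors such as "Executor thread not set after 100 ms" are raised outside the script and would not be caught this way.

```javascript
// Hypothetical business logic; replace with the real transformation.
function transform(msg) {
    if (typeof msg.temperature !== 'number') {
        throw new Error('temperature missing');
    }
    return { temperatureF: msg.temperature * 9 / 5 + 32 };
}

// Runs transform() and, if it throws, stores the error text in metadata
// instead of letting the node fail. `scriptError` is an assumed key name.
function safeTransform(msg, metadata, msgType) {
    try {
        return { msg: transform(msg), metadata: metadata, msgType: msgType };
    } catch (e) {
        metadata.scriptError = String(e);
        return { msg: msg, metadata: metadata, msgType: msgType };
    }
}

// Inside the rule node, the script body would end with:
// return safeTransform(msg, metadata, msgType);
```

With this pattern the message leaves the node unchanged, with the error text attached, so a downstream filter node could check for `metadata.scriptError` and route such messages to an alarm or email branch.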

@pineful

pineful commented Dec 9, 2019

I am also experiencing the blacklist problem on our ThingsBoard. I cannot find the root cause of the JavaScript source code being blacklisted, and I don't understand why it is not logged to a file at all.
I suggest a patch for the 2.4.x versions that logs what happened to cause the blacklisting.
Because it will take a long time before we can apply v3.0 next year, this is affecting us very seriously right now.
I am going to disable the JavaScript VM sandbox (which I really didn't want to do).

@hallard

hallard commented Jun 17, 2020

Blacklist notification is a must, and the way the counter is handled should be reviewed. If 1% of my entries fail (because I do not always have control over what I receive), that can happen; but if the next data received after that is correct, the counter should be reset. I can live with 1% of the data having issues (bad format, missing field, ...), but this 1% should not break the other 99% on the rule chain.
Blacklisting should occur only after 3 (or more) CONSECUTIVE script errors in the rule chain; as soon as a pass succeeds after an error, the error counter should be reset.

Anyway, it's not the first time that this has broken and stopped our rule chain while the devices keep sending messages, so I can't detect it with a "no device communication" check.

So please give us a notification, a log entry, whatever is easy, so we can get rid of this. It's not the first time my customer has called me because it's not working anymore.
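
The consecutive-error policy proposed in this comment can be sketched as follows. This is an illustration of the counting rule only, not ThingsBoard's actual implementation (which lives in the Java server code). `createErrorTracker`, `recordSuccess`, `recordFailure` and `isBlacklisted` are hypothetical names.

```javascript
// Tracks consecutive failures per script; any success resets the count.
function createErrorTracker(maxErrors) {
    var counts = {}; // scriptId -> number of consecutive failures

    return {
        // A successful run clears the failure streak for that script.
        recordSuccess: function (scriptId) {
            counts[scriptId] = 0;
        },
        // A failed run extends the streak by one.
        recordFailure: function (scriptId) {
            counts[scriptId] = (counts[scriptId] || 0) + 1;
        },
        // Blacklisted only when the current streak reaches maxErrors.
        isBlacklisted: function (scriptId) {
            return (counts[scriptId] || 0) >= maxErrors;
        }
    };
}
```

Under this rule, a script that fails on 1% of its inputs would only be blacklisted if three bad messages happened to arrive in a row, whereas a counter that never resets reaches the limit eventually no matter how many messages succeed in between.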

@Nathan-ma

@Bliph, great minds think alike, I guess.
I'm currently facing the same issue: a few scripts are just failing, and I have a bunch of them, so it's hard to track them all. My solution was also an alarm and a log node for when they fail, and I was researching all the community places I could, hoping to find an answer to:

  • Is there a way to extract the error text inside the rule node and add it to metadata (or something like that)?

So I could include the error reason in the alarms.

Perhaps you already found a solution to this. Care to share?
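
If the error text were available in metadata, including it in the alarm could look roughly like the sketch below, written in the style of a "create alarm" details script. Several things here are assumptions. The `scriptError` metadata key is hypothetical and presumes an upstream script has stored the error text there. The `prevAlarmDetails` key follows the default details script as I recall it and should be checked against your ThingsBoard version. The function name `buildAlarmDetails` is only there to make the example self-contained.

```javascript
// Builds the alarm details object from the incoming message metadata.
function buildAlarmDetails(msg, metadata, msgType) {
    var details = {};

    // Keep details from an existing alarm, if the node provides them.
    if (metadata.prevAlarmDetails) {
        details = JSON.parse(metadata.prevAlarmDetails);
    }

    // Copy the error text set by an upstream script (assumed key name).
    if (metadata.scriptError) {
        details.scriptError = metadata.scriptError;
    }

    return details;
}
```

This only helps for errors the script can catch and record itself. A failure raised by the sandbox before or around the script call would leave nothing in metadata to copy.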
