Websocket issues - hanging server

So I’m running the websockets extension. After a few days, it starts giving the following error message and the server processing of requests slows down, eventually requiring a restart of the server. Any thoughts on what could be going on?

Version	Lucee 5.2.2.4-SNAPSHOT
Version Name	Velvet
Release date	May 8, 2017
ColdFusion® compatibility version	2016.0.03.300357
Configuration File	/usr/local/apache-tomcat-8.0.28/webapps/lucee-5.1.0.34/WEB-INF/lucee-server/context/lucee-server.xml
OS	Mac OS X (10.12.5) 64bit
Remote IP	0:0:0:0:0:0:0:1
Host Name	localhost
Servlet Container	Apache Tomcat/8.0.28
Java	1.8.0_51 (Oracle Corporation) 64bit
Architecture	64bit
"ERROR","Thread-5667","05/18/2017","17:30:10","",";The remote endpoint was in state [TEXT_FULL_WRITING] which is an invalid state for called method;lucee.runtime.exp.NativeException: The remote endpoint was in state [TEXT_FULL_WRITING] which is an invalid state for called method
        at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$StateMachine.checkState(WsRemoteEndpointImplBase.java:1148)
        at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$StateMachine.textStart(WsRemoteEndpointImplBase.java:1111)
        at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendString(WsRemoteEndpointImplBase.java:192)
        at org.apache.tomcat.websocket.WsRemoteEndpointBasic.sendText(WsRemoteEndpointBasic.java:37)
        at net.twentyonesolutions.lucee.websocket.WebSocket.sendText(WebSocket.java:391)
        at net.twentyonesolutions.lucee.websocket.connections.ConnectionManager.broadcast(ConnectionManager.java:175)
        at net.twentyonesolutions.lucee.websocket.connections.ConnectionManager.call(ConnectionManager.java:398)
        at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:758)
        at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1701)
        at server.mixins.websocketsmanager_cfc$cf.udfCall(/orrms/server/mixins/WebSocketsManager.cfc:19)
        at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
        at lucee.runtime.type.UDFImpl._call(UDFImpl.java:338)
        at lucee.runtime.type.UDFImpl.call(UDFImpl.java:225)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:697)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:580)
        at lucee.runtime.ComponentImpl.call(ComponentImpl.java:1918)
        at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:758)
        at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1701)
        at pagingmember_cfc$cf.udfCall(/resources/pagingMember.cfc:71)
        at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
        at lucee.runtime.type.UDFImpl._call(UDFImpl.java:338)
        at lucee.runtime.type.UDFImpl.callWithNamedValues(UDFImpl.java:211)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:698)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:580)
        at lucee.runtime.ComponentImpl.callWithNamedValues(ComponentImpl.java:1931)
        at lucee.runtime.tag.Invoke.doComponent(Invoke.java:221)
        at lucee.runtime.tag.Invoke.doEndTag(Invoke.java:194)
        at taffy.core.api_cfc$cf.udfCall1(/orrmsapiv1/taffy/core/api.cfc:337)
        at taffy.core.api_cfc$cf.udfCall(/orrmsapiv1/taffy/core/api.cfc)
        at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
        at lucee.runtime.type.UDFImpl._call(UDFImpl.java:338)
        at lucee.runtime.type.UDFImpl.call(UDFImpl.java:225)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:697)
        at lucee.runtime.ComponentImpl._call(ComponentImpl.java:580)
        at lucee.runtime.ComponentImpl.call(ComponentImpl.java:1918)
        at lucee.runtime.listener.ModernAppListener.call(ModernAppListener.java:420)
        at lucee.runtime.listener.ModernAppListener._onRequest(ModernAppListener.java:222)
        at lucee.runtime.listener.MixedAppListener.onRequest(MixedAppListener.java:43)
        at lucee.runtime.PageContextImpl.execute(PageContextImpl.java:2408)
        at lucee.runtime.PageContextImpl._execute(PageContextImpl.java:2398)
        at lucee.runtime.PageContextImpl.executeCFML(PageContextImpl.java:2366)
        at lucee.runtime.engine.Request.run(Request.java:34)
Caused by: java.lang.IllegalStateException: The remote endpoint was in state [TEXT_FULL_WRITING] which is an invalid state for called method
        ... 42 more

Please open a ticket at https://github.com/isapir/lucee-websocket/issues and add as many details as possible, including some code snippets of how you use the extension.

I will look at it as soon as I can.

Just as a follow up: I solved this problem by using an exclusive lock around connectionManager.broadcast(…). Should such a lock be within the WS extension itself so it doesn’t try to write to the socket when it’s already writing?
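
For anyone wondering what such a lock amounts to at the Java level, here is a minimal sketch of the idea, not the extension's actual code. All class names (`FakeEndpoint`, `SerializedSender`, `BroadcastLockDemo`) are hypothetical, and the fake endpoint only imitates the "already writing" rejection seen in the stack trace above. It assumes the root cause is two threads writing to the same endpoint at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a WebSocket remote endpoint. Like the endpoint in
// the stack trace, it throws IllegalStateException on an overlapping write.
class FakeEndpoint {
    private final AtomicBoolean writing = new AtomicBoolean(false);
    private final List<String> sent = new ArrayList<>();

    void sendText(String message) {
        if (!writing.compareAndSet(false, true)) {
            throw new IllegalStateException("endpoint is already writing");
        }
        try {
            sent.add(message);
        } finally {
            writing.set(false);
        }
    }

    int sentCount() {
        return sent.size();
    }
}

// Wraps one endpoint and lets only one thread write to it at a time.
class SerializedSender {
    private final FakeEndpoint endpoint;
    private final ReentrantLock lock = new ReentrantLock();

    SerializedSender(FakeEndpoint endpoint) {
        this.endpoint = endpoint;
    }

    void send(String message) {
        lock.lock();
        try {
            endpoint.sendText(message);
        } finally {
            lock.unlock();
        }
    }
}

class BroadcastLockDemo {
    // Sends from several threads at once and returns how many messages arrived.
    static int runDemo(int threads, int perThread) {
        FakeEndpoint endpoint = new FakeEndpoint();
        SerializedSender sender = new SerializedSender(endpoint);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    sender.send("message " + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return endpoint.sentCount();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```

The lock here is per endpoint rather than one application-wide lock, so a slow client would only delay writes to itself. That is a design choice of this sketch, not something confirmed about the extension.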

Possibly. I saw somewhere that this is a synchronization issue, but I asked you to open a ticket and you didn’t, so I didn’t look into it any further, since you were the only one who reported it.

Added as https://github.com/isapir/lucee-websocket/issues/6

FYI: This issue has been fixed in version 2.0 of the extension, but the setup of the extension has changed because of some other required changes, so please see README.md at https://github.com/isapir/lucee-websocket

Re-starting this thread! In 2017, I ended up not using WebSockets since I didn’t have time to work on the stability issues we were having, but I now need to revisit using Lucee WebSockets for our app.

After installing the stack below and getting WebSockets running properly, I’m having occasional issues where a client appears to connect initially but then stops receiving messages from the server. My working theory is that the client got a valid, working WebSocket connection the first time the web app loaded, then dropped the connection midstream due to a network issue, and every subsequent reload of the web app results in no messages being received by that client. The only fix is restarting the Lucee server.

In the websocket log, I’m getting the following exception:

The remote endpoint was in state [TEXT_FULL_WRITING] which is an invalid state for called method

Per my post from 2017, I already use a named application lock with a 10-second timeout, but once the above starts to happen, the lock itself starts timing out.

I also see the following two exceptions:

"DEBUG","http-nio-8888-exec-9","DATE","TIME","websocket","connection 8f enter onClose(); CloseReason: code [1001], reason [null]"

An existing connection was forcibly closed by the remote host.

So my questions:

  1. Has anyone else encountered this issue, and do you have any thoughts on what’s happening or how to solve it?
  2. In my review online, I found the following resources re: Tomcat:
    63931 – The remote endpoint was in state [TEXT_FULL_WRITING] which is an invalid state (apache.org)
    Blocking IO does not appear to always be blocking with asyncIO - Mark Thomas - org.apache.tomcat.dev - MarkMail
    a. I see my version of Tomcat is 9.0.24. Is there an easy way to update the Tomcat that my current Lucee server runs on, or will I need to install Tomcat first and then run Lucee as a servlet in it? (I would prefer to avoid that if I can!)
  3. I wonder if I’m encountering a similar or the same issue as Websocket connection mgr broadcast() exception on closed connections - hacking / extensions - Lucee Dev

Thanks much! FYI, I’m building on and compiling the documentation and thread answers by @isapir @andreas @martin, and will make the result publicly available, including a merge request to the official lucee-websocket git repository.

Regards,
Cage


System Information

Version Lucee 5.3.8.206
Version Name Gelert
Release date Sep 24, 2021


Servlet Container Apache Tomcat/9.0.24
Java 11.0.4 (AdoptOpenJDK) 64bit
OS Windows Server 2016 (10.0) 64bit

I have no clue on this :confused: … I have to admit that I never implemented websockets in production, just for some testing. Are you able to reproduce this issue? What exactly happens on the client side?

I’ve read somewhere that one solution was using another servlet engine (e.g. Jetty). If you know how to reproduce this issue, I’d quickly spin up Lucee on an Undertow instance with CommandBox to test whether you get the same stack trace.

This may or may not be related, but I had websockets running fairly well until I upgraded to Lucee 5.3.8.206 with a clean install, which uses Tomcat 9.0.46. Now my websockets are unstable… they work for a while and then they’ll stop working and all connections fail. I do not see the log error that includes “TEXT_FULL_WRITING”, but I do get a log error

"Failed to register endpoint /ws/products/{channel}: Deployment of WebSocket Endpoints to the web application with path [] in host [Catalina/local.mysite.com] is not permitted due to the failure of a previous deployment"

which I think is because when I reset the app, I call WebsocketRegister() again. Restarting Lucee resolves the issue.

I created a new issue on @isapir’s GitHub: Issues with Lucee 5.3.8.206 and Tomcat 9.0.46 (Issue #26, isapir/lucee-websocket)

Grrh, this problem is really killing things with WebSockets. Things run fine for a while and then these errors accumulate, and then all the clients stop receiving messages.

The remote endpoint was in state [TEXT_FULL_WRITING] which is an invalid state for called method;

Unfortunately, Google isn’t very helpful as to how to deal with these errors.

Am I supposed to do something with the onError handler in Lucee WebSockets to gracefully deal with this issue? And if so, what exactly?!

Upgrade your version of Tomcat.

This was fixed in later versions of Tomcat:
https://bz.apache.org/bugzilla/show_bug.cgi?id=63931

Okay thanks, I’ll try again. I attempted to install Tomcat + Lucee in the context of IIS a few weeks back and, unfortunately, I couldn’t get URL rewriting & mappings for client/server app to work correctly :frowning:

Tomcat 9 destabilized an already fragile implementation of the websocket extension, FYI. Setting up websockets to run on IIS has never been simple. This is the thread you need for that: Troubleshooting Lucee Websockets with IIS and ARR

Thanks, I did already find and follow that thread, and it got me up and running with WS. The issue at this point is the error message mentioned above and the fact that it crashes under heavy load. I will try upgrading Tomcat to see if that resolves it. If not, I think the Stack Overflow question “WebSocket Closes with Protocol Error 1002” may have a solution. Unfortunately, I’m not a Java programmer, so I wouldn’t be able to update the Lucee WebSockets Extension code to see if it works.

I created an issue… vote it up and follow it: [LDEV-3815] Provide robust websocket support - Lucee

So I updated Tomcat to 9.0.56 and, sadly, continue to have issues. I get an “unable to broadcast to closed connection” error, and then WebSockets hang until Lucee is restarted :frowning:
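
One defensive pattern for the closed-connection case, sketched in Java under stated assumptions: skip connections that are no longer open, and drop any connection whose write fails, so that one dead client does not abort the broadcast for everyone else. `ClientConnection`, `StubConnection`, and `PruningBroadcaster` are hypothetical names for illustration, not lucee-websocket or Tomcat APIs, and whether this would prevent the hang described above is untested.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal view of a client connection; not the extension's API.
interface ClientConnection {
    boolean isOpen();
    void sendText(String message) throws IOException;
}

// Test double: can be created closed, or made to fail when written to.
class StubConnection implements ClientConnection {
    private final boolean open;
    private final boolean failOnSend;
    int received = 0;

    StubConnection(boolean open, boolean failOnSend) {
        this.open = open;
        this.failOnSend = failOnSend;
    }

    public boolean isOpen() {
        return open;
    }

    public void sendText(String message) throws IOException {
        if (failOnSend) {
            throw new IOException("connection reset by peer");
        }
        received++;
    }
}

class PruningBroadcaster {
    private final Map<String, ClientConnection> connections = new ConcurrentHashMap<>();

    void add(String id, ClientConnection connection) {
        connections.put(id, connection);
    }

    int size() {
        return connections.size();
    }

    // Sends to every open connection and returns how many sends succeeded.
    int broadcast(String message) {
        int delivered = 0;
        for (Map.Entry<String, ClientConnection> entry : connections.entrySet()) {
            ClientConnection connection = entry.getValue();
            if (!connection.isOpen()) {
                // Closed connections are removed instead of being written to.
                connections.remove(entry.getKey(), connection);
                continue;
            }
            try {
                connection.sendText(message);
                delivered++;
            } catch (IOException | IllegalStateException e) {
                // One bad client should not abort the broadcast for the rest.
                connections.remove(entry.getKey(), connection);
            }
        }
        return delivered;
    }
}

class PruningDemo {
    public static void main(String[] args) {
        PruningBroadcaster broadcaster = new PruningBroadcaster();
        broadcaster.add("a", new StubConnection(true, false));  // healthy
        broadcaster.add("b", new StubConnection(false, false)); // already closed
        broadcaster.add("c", new StubConnection(true, true));   // fails on write
        System.out.println(broadcaster.broadcast("hello")); // prints 1
        System.out.println(broadcaster.size());             // prints 1
    }
}
```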

@Redtopia I have had great success using the proxy system that I tried to explain in this thread:

This allows me to edit the websocket listener and reload it without having to restart the server. This is a great help during development.

@martin yea, your solution looks good. I would have to refactor a bunch of code to make it work like that, but thanks for the tip.