ColdFusion's Catch Struct Not Locally Varred?

Posted by Brad Wood
Jan 19, 2011 05:05:00 UTC
When catching an error in a CFC, I've always assumed that the exception object was a locally scoped variable, specific to that method call. Some interesting errors we received from the depths of our ColdBox framework made me start to question that. I concocted a test last night which appears to prove that your exceptions are not thread-safe in a CFC stored in a persistent scope (in CF8, at least).

Background

When you use the try/catch construct in tags, your exception object will always be a variable called "cfcatch".
[code]
<cftry>
	<cfset t = r>
	<cfcatch>
		#cfcatch.message#
	</cfcatch>
</cftry>
[/code]
When you use try/catch in cfscript you're allowed to specify a name for the exception object. In CF8, these must be simple names (meaning you can't do "local.err") and you aren't allowed to var the variable at the top of your function.
[code]
<cfscript>
	try
		{
			t = r;
		}
	catch(Any err)
		{
			writeOutput(err.message);
		}

</cfscript>
[/code]

The Problem

So, starting a few days after releasing some new code to production, we would occasionally see the following errors come to the logs (where "e" was the name of our exception object):
  • Element MESSAGE is undefined in E.
  • Element DETAIL is undefined in E.
  • Element STACKTRACE is undefined in E.
Adobe guarantees those properties to be defined in your exception object. The try/catch in question was in a component of the ColdBox framework that is placed in the application scope and can be hit by multiple threads at the same time. Long story short, I found that an error had been happening in the "try" block EVERY time it was called. This means that the "catch" block was getting called a lot, and sometimes by multiple requests at the same time. This was the perfect setup for a race condition, and it looked like the exception object was not being treated as a variable local to that method call, but was instead being shared by all requests currently executing that method. Here is the simple test I devised to show the issue.

First Test

First, I created a simple component with a method that causes an error and catches it. I actually created two methods: one in CF tags, and one in script. I tested both with the same result, but for the sake of simplicity I'll only show the cfscript version here, from test.cfc:
[code]
<cfcomponent>

	<cffunction name="doError">	
		<cfscript>
			var local = {};
			try
				{
					t = r;
				}
			catch(Any err)
				{
					local.message = err.message;
					local.detail = err.detail;
					local.stackTrace = err.stackTrace;
				}
		</cfscript>	
	</cffunction>
	
</cfcomponent>
[/code]
As you can see, "r" isn't defined and will throw an error. The catch block simply accesses a few of the exception object's (err) properties to verify they are there. Then I set up the following page, index.cfm, to create my CFC and call its doError method a bunch of times:
[code]
<cfparam name="url.reinit" default="0">
  
<cfapplication name="test">
 
<cfif url.reinit>
	<cfset application.test = createObject("component","test")>
<cfelse>
	<cfloop from="1" to="500" index="i">
		<cfset application.test.doError()>
	</cfloop>
</cfif>
[/code]
Hitting my page with ?reinit=1 in the URL would create a new instance of my "test" component. Hitting my page without the URL parameter called the doError() method 500 times. I opened up two tabs with the index.cfm page without the URL parameter and would refresh one right after the other. I can't figure out why, but I could produce an error easily the first time. After that, it seemed impossible to produce the error again, until I added the reinit flag in to re-create the CFC. Then, on the next run, the error would usually occur in one or both tabs. Either "message" or "detail" or "stacktrace" would be undefined in my exception object. I assume the inconsistencies arose from internal timing differences after the method had been called a couple times. Either way, this showed the exception object could be easily corrupted by parallel threads also catching errors in that CFC.
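A possible variation on this test, sketched here rather than taken from the runs described above, is to generate the concurrency inside a single request with cfthread instead of refreshing two browser tabs. The file name threads.cfm and the thread names worker1 and worker2 are made up for illustration, and it is only an assumption that threads spawned this way overlap the same way two separate requests do. Any error that escapes doError() should terminate its thread and show up in that thread's metadata.

[code]
<!--- threads.cfm: hypothetical alternative to refreshing two browser tabs --->
<cfapplication name="test">

<!--- Start from a fresh instance, like hitting index.cfm?reinit=1 --->
<cfset application.test = createObject("component","test")>

<!--- Launch two threads that hammer the same shared CFC instance --->
<cfloop from="1" to="2" index="n">
	<cfthread action="run" name="worker#n#">
		<cfloop from="1" to="500" index="i">
			<cfset application.test.doError()>
		</cfloop>
	</cfthread>
</cfloop>

<!--- Wait for both threads to finish before inspecting them --->
<cfthread action="join" name="worker1,worker2" />

<!--- Report each thread's final status and any error that escaped doError() --->
<cfoutput>
	<cfloop list="worker1,worker2" index="threadName">
		#threadName#: #cfthread[threadName].status#
		<cfif structKeyExists(cfthread[threadName], "error")>
			- #cfthread[threadName].error.message#
		</cfif>
		<br>
	</cfloop>
</cfoutput>
[/code]

A thread whose status is TERMINATED with a message such as "Element MESSAGE is undefined in ERR" would correspond to the corrupted exception object seen in the two-tab test.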

Second Test

Now I wanted to take it up a notch. My first test showed that the exception object was getting corrupted by two separate requests both throwing the same error over and over again in a method. Now I wanted to see if the actual error details were leaking across from one request to the other as well. I modified my test slightly to throw a different (but predictable) error every time. I changed the loop in index.cfm to start at a configurable index, and pass that index into the doError method:
[code]
<cfparam name="url.reinit" default="0">
<cfparam name="url.start" default="0">
 
<cfapplication name="test">
 
<cfif url.reinit>

	<cfset application.test = createObject("component","test")>

<cfelse>
	 
	<cfloop from="#1+url.start#" to="#500+url.start#" index="i">
	  <cfset application.test.doError(i)>
	</cfloop>
	
</cfif>
[/code]
My doError() method was changed to accept that ever-incrementing index as an argument and to change the non-existent variable it tried to access based on that input. Next, the "catch" block checks the message in the error struct to see if it matches the missing variable name that it SHOULD have errored on.
[code]
	<cffunction name="doError">
		<cfargument name="index">
	
		<cfscript>
			var local = {};
			try
				{
					t = local[arguments.index];
				}
			catch(Any err)
				{
					local.message = err.message;
					local.stacktrace = err.stackTrace;
					if(not local.message contains "Element #arguments.index# ")
						{
							writeOutput("'#left(local.message,11)#' neq 'Element #arguments.index# '!<br>");
						}
				}
		
		</cfscript>
	
	</cffunction>
[/code]
I ran one tab normally, so it tried and caught 500 errors like so:
  • Element 1 is undefined in a CFML structure referenced as part of an expression.
  • Element 2 is undefined in a CFML structure referenced as part of an expression.
  • Element 3 is undefined in a CFML structure referenced as part of an expression.
  • ...
  • Element 500 is undefined in a CFML structure referenced as part of an expression.
In the second tab I added ?start=500 to the URL, so it tried and caught 500 errors like so:
  • Element 501 is undefined in a CFML structure referenced as part of an expression.
  • Element 502 is undefined in a CFML structure referenced as part of an expression.
  • Element 503 is undefined in a CFML structure referenced as part of an expression.
  • ...
  • Element 1000 is undefined in a CFML structure referenced as part of an expression.
I did that so it would be obvious which missing variable names were from each request. After reinitting my new object, I refreshed both tabs and found anywhere from 0 to 100 conflicts per run, where the exception object's message was NOT equal to the error that I knew had been thrown. Output would look something like so:
[code]
'Element 167' neq 'Element 506 '!
'Element 184' neq 'Element 524 '!
'Element 185' neq 'Element 525 '!
'Element 186' neq 'Element 526 '!
'Element 187' neq 'Element 527 '!
'Element 190' neq 'Element 530 '!
'Element 191' neq 'Element 531 '!
'Element 192' neq 'Element 532 '!
'Element 193' neq 'Element 533 '!
'Element 194' neq 'Element 534 '!
'Element 205' neq 'Element 544 '!
'Element 206' neq 'Element 545 '!
'Element 207' neq 'Element 546 '!
'Element 208' neq 'Element 547 '!
etc...
[/code]
Each line in the output represents an instance of the catch message "bleeding over" from the other thread. I played around with different settings, and the best combination to reproduce was with debugging off, trusted cache on, and with the "local.stacktrace = err.stackTrace;" line in the "catch" block. I assume it all has to do with the internal timing of each thread and their likelihood to overlap.

Conclusion(s)

  1. ColdFusion 8's exception object does not appear to be thread safe. (I tested on a 32-bit Windows server with a stand-alone install.)
  2. Errors being caught at the exact same instant, by more than one request, in a CFC stored in a persistent scope are probably fairly uncommon, so I suppose I can understand why most people have never run into this scenario.
  3. The errors being thrown can collide even though they are in two DIFFERENT methods in the CFC.
  4. Tag-based cftry and cfcatch will interfere with any other tag-based cftry and cfcatch in that component since the exception struct is always called "cfcatch".
  5. Cfscript-based try and catch will ONLY interfere with other cfscript-based try and catch in that component if they use the same variable name for the exception object.
  6. Googling for this problem will return you a great big pile of nothing.
  7. Adam Cameron logged a bug for this in the Adobe CF Bug database last July under ColdFusion 9.
  8. I won't be entering a new bug, but instead voting for his. You should vote for it too.
  9. Why the heck doesn't Google index the contents of the Bug Database?
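
One untested way to work around this, assuming the exception object really is shared per component instance as the tests above suggest, would be to serialize the try/catch behind an exclusive named lock. This is only a sketch: the lock name "test_cfc_trycatch" and the timeout are made up, and the tag-based doError() below is a rewrite for illustration, not the method used in the tests.

[code]
<cffunction name="doError" output="false">
	<cfset var local = {}>

	<!--- Hypothetical lock name: only one request at a time may run this try/catch --->
	<cflock name="test_cfc_trycatch" type="exclusive" timeout="5">
		<cftry>
			<!--- "r" is undefined, so this line always throws --->
			<cfset local.t = r>
			<cfcatch type="any">
				<!--- Copy what is needed out of cfcatch while still holding the lock --->
				<cfset local.message = cfcatch.message>
				<cfset local.detail = cfcatch.detail>
			</cfcatch>
		</cftry>
	</cflock>

	<cfreturn local>
</cffunction>
[/code]

Given conclusions 3 and 4, every try/catch in the component would presumably have to share the same lock name for this to help, and named locks are server-wide, so the name should be unique to the component. The cost is that concurrent calls queue up behind the lock, and a caller that waits longer than the timeout gets a lock exception instead.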

 


Seth Feldkamp

He's back! Nice work as always Brad. Very interesting stuff. Voting for this bug now.

So I guess the take away for working around this would be to use cfscript try / catch for non-threadsafe objects and name the catch variables uniquely.

Brad Wood

@Seth, the unique name trick might work, but it would have to be different for EVERY request and I'd have to see if cfscript will let you declare the exception object's name dynamically at run time.

Mark Andrachek

Oh good grief - good catch. I would consider this not just a bug, but a security issue as well (possible information leakage as a result).

The bug db is apparently built entirely in Flash, which is why Google doesn't index it.

Raymond Camden

Check out Elliott Sprehn's port of the bug tracker:

http://www.elliottsprehn.com/cfbugs/

Brad Wood

@Mark: Adobe/Google supposedly "fixed" the indexing flash problem back in 2008: http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html I remember Ben Forta announcing that Google would be able to pull text from dynamic, DB-driven Flash sites.

@Ray, thanks for the link to Elliott's bug tracker port. Do you know how he got the data for it? Did he somehow scrape the site himself, or did someone on the "inside" give him access to the database? I know you helped develop it, but kind of assumed Adobe handled all the hosting and such on their own.

Raymond Camden

Well I can't speak for how Elliott did it, but I can speak to how I would do it. Flash Remoting is just a network protocol. I'd fire up Charles and watch the calls.

Elliott Sprehn

@Brad

My tracker speaks directly to the Adobe tracker's Flash Remoting service. That's what makes it a "live" mirror. Ray's description is exactly how I managed that.
