FarCry 6.2 Request Timeout Issue

Hi folks,

I have a strange request timeout issue on my FarCry 6.2 site since it was switched to connect to the production database server.
Exactly the same code, running on the same server, connected to an internal SIT database server without any issue.

After the data source changed, any page request now times out, even if it contains only one line of cfoutput code.
It initially looked like a database issue to me, so I created another test site (non-FarCry) with a single page that queries the database I’m having problems with, and I get the response back without issue.

From the debugger information I can see that C:/wwwroot/mysite/farcry/core/tags/farcry/_farcryApplicationInit.cfm took a long time to execute:
Total Time  Avg Time  Count  Template
192387ms    192387ms  1      C:/wwwroot/mysite/farcry/core/tags/farcry/_farcryApplicationInit.cfm
141380ms    1285ms    110    C:/wwwroot/mysite/farcry/core/packages/fourq/_fourq/getData.cfm
42486ms     42486ms   1      CFC[C:/wwwroot/mysite/farcry/core/packages/security/security.cfc | init() ]

Also any query to the farPermission table took more than 5 seconds to execute.

The site is running farcry 6.2 on ColdFusion 11 update 6, Windows 2012 R2 standard server 64 bit. IIS 8.5.

Could anyone share some thoughts on this? I got no clue at the moment :frowning:

Thanks,
Xiaofeng

When you did your isolated test query, was it using the same DSN as your FarCry app, or a different one?

Did you check to see how long that individual query took? For example, was it < 10ms? When Core starts up it’s possible it needs to do several hundred queries so if you have any kind of unusual latency issues then over many hundred DB requests it could be grinding to a halt.
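To put rough numbers on that, here is a minimal Python sketch that multiplies query count by average per-query time. The function name is hypothetical; the first pair of figures comes from the debug output in the first post (110 calls to getData.cfm at about 1285ms each), and the 10ms figure is only an assumed "healthy" latency for comparison.

```python
def total_sequential_ms(query_count, avg_ms_per_query):
    """Total time if the queries run one after another."""
    return query_count * avg_ms_per_query

# Figures from the debug output: 110 getData.cfm calls at ~1285 ms each.
print(total_sequential_ms(110, 1285))  # 141350

# The same 110 calls at an assumed 10 ms each.
print(total_sequential_ms(110, 10))    # 1100
```

The first result is close to the 141380ms the debugger reported for getData.cfm, so the startup time is consistent with many individually slow round trips rather than one huge query.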

Is the new DB server on the same LAN segment as the previous DB server that was working? Is it somewhere further away?

I doubt it’s just a configuration setting on your app server, it does seem to be more like a networking issue, but it could also be very hard to diagnose without doing lots of different tests :smile:

Thanks Justin.

Yes the test query from the non-farcry app was connecting to the same DSN
as the farcry site. The isolated test was not very fast either especially
when I requested it for the first time, but eventually, after about 10 seconds,
it did return the result. Subsequent requests completed instantly.

The new DB server is in the DMZ. What actually happened is that the CF server
was on the local corporate network, connecting to a database
server also on that network. I did all my testing there and everything worked
fine, so the CF server was moved into the DMZ, which is the same DMZ where
the production database server sits. The standard port 1433 for SQL Server
has been opened so that I can create the DSN in CF admin.

I also feel it has something to do with the network, but it’s a bit hard to explain to
the network guys because, from their point of view, as long as I can telnet to
that port they’ve done their work.

What sort of testing can I do to figure this out?

It’s a tricky one, my initial thoughts were latency but that should be easy to test by doing a ping.
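If ping is not an option between the two hosts, timing a plain TCP connect to the SQL Server port gives a similar reading. This is only a minimal Python sketch, assuming Python is available on the app server; `time_tcp_connect` is a hypothetical helper name, and the host you pass in is your own DB server.

```python
import socket
import time

def time_tcp_connect(host, port, attempts=5, timeout=10.0):
    """Return the time in seconds taken by each TCP connect attempt.

    When `host` is a name rather than an IP address, the measured time
    also includes the name lookup.
    """
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        # Open the connection and close it again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append(time.perf_counter() - start)
    return timings
```

Calling it with the DB server's hostname and again with its IP address, on port 1433, from both a working server and the problem server, should show whether the delay is already there at the connection level or only appears later.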

If it’s not latency… Is there any kind of network device in between the app server and DB server that does packet inspection? Those types of appliances could have a performance impact.

If the app server and DB server are both in the DMZ on the same network segment then I would expect the network performance to be fine.

The other things that you could check are network interface card drivers (this does happen from time to time, even on virtual servers). Make sure there are no strange port settings on switches and no links between the devices that are saturated / at capacity, or even test whether things other than database connections are slow (e.g. file transfers).

Do you have another app server which can access the database server? (Maybe one that can tunnel through the firewall?) Or vice versa, another database server that you can restore the database on and test from the app server? Perhaps you can narrow down whether the app server or the DB server has the problem, or whether the problem is in between them…

I know that seems like a lot of generic advice but it’s the best I can offer without seeing the environment :slight_smile:

A couple of other easy things to check in the CF Administrator: make sure all debug settings are turned off, as well as all the request monitoring stuff. The monitoring features are a production killer (at least in older versions of CF they were for sure).

I do have an old CF server (ColdFusion 8) that sits within the same DMZ, and the old server connects to the same DB server without a problem.

In fact, I’m updating / migrating a FarCry site from the old CF server to the new one (CF 11), which is the one I currently have problems with.

So it looks like the DB server is not the problem I guess. It is either the CF server or the network.

I think I will try the request monitoring in CF admin to see if I can see anything…We don’t have a license for FusionReactor anymore…

Ok, so that narrows it down to the app server itself or its connection to the DB server.

I’m not sure if the CF request monitoring will give you much insight (it could in fact be a cause) so I’d make sure all of the monitoring options are turned off completely.

I found that many queries against the farPermission table take about 5 seconds to retrieve just one record from the database. I can see there are lots of queries to the farPermission table when the core starts up.

So I did a test from a SQL client on the problematic CF server, running exactly the same query, and it took 5 seconds to retrieve one record from the database.

Then on the old CF server (which doesn’t have the problem) I used the same SQL client to run exactly the same query, and it returned the result instantly.

The problem on the new server only started after it was moved into the DMZ.

I think I need to talk to the network guys.

Yeah, that rules out the CF engine and the application code if the queries are also slow using a SQL client… Glad to hear you got it narrowed down :smile:

Perhaps there’s some networking configuration on the server that had to change after it moved into the DMZ, or it could be some other hardware (e.g. a switch) that it’s connected to…

Please let us know what you find!

Any luck tracking down the root cause?

I had a meeting with our network, database and server guys yesterday. We did
a bunch of tests together. We noticed that from the SQL client on the
problematic server, even a query fetching 5000 records takes
5 seconds, the same as retrieving just 1 record. Every subsequent query
also takes a very consistent 5 seconds. So it’s not related to the data
size, and they agreed it’s more of a latency issue. The network guy then made some
tweaks, something related to resetting some filters; I don’t know what exactly. That
fixed the 5-second problem for the SQL client. Now, from the SQL
client on both servers (the old one without the problem and the new one with the timeout
issue), I run a query and get pretty much the same response time.
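The reasoning about the 1-record and 5000-record queries (same time for both means a fixed delay rather than a data-size cost) can be sketched as a two-point fit. This is only an illustration: the function name is hypothetical, and the 5.0 values stand in for the "about 5 seconds" observed before the network change.

```python
def split_fixed_and_per_row(rows_a, secs_a, rows_b, secs_b):
    """Fit secs = fixed + per_row * rows through two measurements."""
    per_row = (secs_b - secs_a) / (rows_b - rows_a)
    fixed = secs_a - per_row * rows_a
    return fixed, per_row

# 1 record and 5000 records both took about 5 seconds.
fixed, per_row = split_fixed_and_per_row(1, 5.0, 5000, 5.0)
print(fixed, per_row)  # 5.0 0.0
```

A per-row cost of zero with a fixed cost of 5 seconds per query points at something that happens once per request (connection, lookup, handshake) rather than at the amount of data transferred.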

But when I ran my FarCry site, I still got a timeout. Then I switched my DSN
to the internal DB server and the site started successfully. Now the weirdest
thing happened: I switched the DSN back to the DB server in the DMZ, did a
hard refresh, and the site started but was extremely slow. Just to
make sure, I restarted ColdFusion, and then I got the timeout again.

I’m wondering what the next best thing to try would be.

Xiaofeng

So, the network or server guys “changed something” and now queries from the SQL client are fast but queries from CF are still slow?

Can you find out what they changed? It might be critical to getting queries in your app working, because performance from both places should be exactly the same…

What I do not understand now is that I still see lots of queries to the farPermission
table taking around 5 seconds to execute from CF.

One of these is:

SELECT locked, aRoles, ObjectID, ownedby, lastupdatedby, label,
datetimecreated, title, lockedBy, datetimelastupdated, shortcut, createdby

FROM DBO.farPermission
WHERE ObjectID = '82C2B4FE-308B-11DF-B4080019BB28FF60'

From the same server, I ran this query from the SQL client and it now returns the
result instantly (it used to be a very consistent 5 seconds for any query).

It now looks like a problem with the application.

This isn’t a problem in the application; these are normal queries that look up the permissions, and the results then get cached.

Your problem is that all queries are slow, so focusing on this specific query will not resolve the problem.

You said that queries in the SQL client were slow, and then your network/server guys “changed something” and now the queries in the SQL client are fast. Can you please find out exactly what they changed? This could be critical to getting your app working.

When you connect to the DB via the SQL client:

  • are you using a hostname or an IP address?
  • are you using an instance name?
  • are you using a specific port number?

When you set up your DSN in CF are you using the exact same settings as above in the SQL client?
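The hostname versus IP address question matters because a hostname adds a name lookup before the connection is made. A minimal Python sketch for timing that lookup on its own is below; `time_name_lookup` is a hypothetical helper, the default port 1433 is taken from earlier in the thread, and this is not a claim that name resolution is the cause here.

```python
import socket
import time

def time_name_lookup(host, port=1433):
    """Return (seconds, addresses) for resolving `host`."""
    start = time.perf_counter()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    elapsed = time.perf_counter() - start
    # Each entry ends with a socket address whose first item is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return elapsed, addresses
```

Running it with whatever server name the DSN uses, and comparing the returned addresses with what the SQL client connects to, is one way to confirm that both are really talking to the same endpoint in the same way.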

Like I said before, you really need to find out what changes were made which fixed the performance of the SQL client. If you can find out what those changes are, then perhaps we can work out how to fix it :wink: