Monday, February 20, 2012

old ASP application with new connectivity problems

Hello,

we're trying to brain out a new problem with an old application and I thought this might be the best place to get the help.

The application has been running fine since 2001. The original system configuration was a web machine connecting to a DB machine. Both systems ran Win2k and the original DB was on SQL2k. My company has licensed the system from the original authors and we've redeployed it onto new machines.

The new system configuration is a web machine connecting to a DB machine. The web machine is win2k3r2 32-bit and the DB machine is win2k3r2 64-bit. The new DB is on SQL 2k5. There are also some network design differences (the web and DB machine talk over a dedicated backnet), but we're not seeing any connectivity problems there at all.

Before I describe the current issues, please note... if we run the SQL Server and webserver on the same machine, this issue disappears. This issue also never once appeared on the older/slower deployment from 6 years ago.

We have a chunk of code (that I could post, but it would be a waste of time, it's VAST) that grabs a recordset through OLEDB and then loops through it, each time making another new connection to the DB machine. which probably opens and closes a few other connections each time, too. Loops within loops within loops, all opening multiple connections. (like I said, this has worked for 6 years on older equipment and software).

We now get the following error:

Microsoft OLE DB Provider for SQL Server error '80004005'

[DBNETLIB][ConnectionOpen (Connect()).]SQL Server does not exist or access denied.

/faculty/Portals/FacultyPortal.asp, line 169

line 169 (and the one before it for clarity) look like this:

aszSQL = "select query from query_builder where ID = " & teamID
aobRS.open aszSQL, cnnTask, adOpenForwardOnly, adLockReadOnly

cnnTask is opened earlier in the code and is repeatedly used for several different purposes. All connections are left open until the end of the ASP page where they are closed.

This is not where the application dies each time, but it always dies in the inner-most loop during a similar RS.open statement.

The connection string we are using to access the DB is:

application("gszConnectionString") = "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=demo;Data Source=ppdb1-data"

Here are some things we've tried and some results:

1 - we called dell and had then try to help us, thinking it was a network connectivity problem. They ran packet sniffers and stuff and declared that we should update MDAC. This did not solve the problem.

2 - both the web and the DB machine were using Intel ProSet/1000 adapters with load-balanced teamed ports. The teams were dissolved. This did not solve the problem.

3 - I tried changing 'Persist Security Info=False' to True. This didn't solve the problem, but did change the error message to Timeout() (instead of does not exist or can't login).

4 - We've tried fiddling with SQL Server's TCP properties (like keep alives) with no luck.

5 - In the course of testing, we loaded our copy of SQL Server onto the web machine and were, initially, able to reproduce the problem on a single machine. using almost the same connection string (but with 127.0.0.1 in the data source name). When we later changed the actual IP address to a resolvable name (ie. localhost), it stopped. <boggle>

6 - we've tried SQL logins instead of integrated security. this didn't help either.

We're at a total loss for an explanation and, apparently, so is Dell. Not surprising, I guess, but this is particularly vexing for us because it's all newer faster equipment but produces problems we didn't see on the old deployment.

Does anyone have any other suggestions of things to try before we give up and start a rewrite in .NET? (which I am loathe to invest in at this point)

Thanks,

Ryan

You are getting the double hop problem it is a known issue to developers, the link below is from Microsoft support there are two more links relevant to you at the end of this link. Hope this helps.

http://support.microsoft.com/kb/810572/

|||Caddre thanks for the quick reply, I love getting community help! Smile

Hmmm... that particular article doesn't seem to apply to this case, this is an ASP 3.0 application (vb script, not a .NET application).

Also, the other articles at the bottom contain information about how to configure asp applications to use SQL Server on a remote machine. This is working for the most part.

Maybe I wasn't clear enough, but, We're having a failure after some time in the process... Like, it takes 15-20 seconds for the failure to occurr... when the code works (when using a DB loaded onto the same machine) it takes 45 seconds or so to execute.

We were fooling around with it further this afternoon and made another discovery. It has something to do with TCP connections. Sure, duh, but, it's confirmed now. Here's what we did:

1 - loaded the app and SQL onto the web server. the connection string is as above, but with the data source=ppWeb1.

2 - Tested the application, everything works including the long query (if we look in SQL config manager, we find that the default setting of shared memory is still the preferred connection method).

3 - change the connection string (still using the DB on the same machine) to say data source=tcpStick out tonguepWeb1,1433

4 - test the application again and everything still works except the long queries.

so... without a doubt, it's only a problem for TCP connections... even if they never leave the machine (ie. never hit the PHY layer of the network stack).

<boggle>?

Any one else have any suggestions, please?

Thanks,

Ryan
|||

Then you need to use the surface area configuration tool to allow both local and remote and enable both TCP/IP and Named Pipes.

|||heya Caddre,

Thanks for the help, that wasn't it either, but, I was going to post here an extended revelation of what *WAS* the problem and how I fixed it this morning... *grin*

Ok, first thing I did to figure this out was fool with the connection string... this didn't fix it, but it did point me in the right direction...

ORIGINAL CONNECTION STRING -
Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=demo;Data Source=ppdb1-data

NEW CONNECTION STRING -
Provider=SQLNCLI;Trusted_Connection=yes;Server=ppDB1-data;Database=ULL

This still didn't work, but it produced an error we'd never seen before... something about the Named Pipes provider being unavailable. Now, this is after I had used the surface area configuration tool to turn on named pipes (did some unsuccessful testing) and then turned it back off. So, the web server was still trying to default to named pipes. Changed the connection string again...

NEW CONNECTION STRING -
Provider=SQLNCLI;Trusted_Connection=yes;Server=ppDB1-data;Database=ULL

NEW NEW CONNECTION STRING -
Provider=SQLNCLI;Trusted_Connection=yes;Server=tcpStick out tonguepDB1-data,1433;Database=ULL

Forcing TCP connections... this produced the same error as before, but FINALLY SQLNCLI produced an error message that was helpful! The error was:

Only one usage of each socket address (protocol/network address/port) is normally permitted

AHA! So a quick google of this error yields the following MSDN blog entry here:

Durgaprasad Gorti's WebLog

The problem was that the new machines were SO MUCH faster than the old machines (remember, there's 6 years separation between the old and new servers) that the new machines were exhausting the 5000 open TCP ports in seconds and windows wasn't releasing them for a further 240 seconds!

I changed 2 registry key entries to fix this. If you go into:

HKLM\System\CurrentControlSet\Services\Tcpip\Parameters

in the registry and add a DWORD for MaxUserPort and set it to (decimal) 65534, it raises the number of available ports for transmission.

Then also add another DWORD for TCPTimedWaitDelay and set it to (decimal) 30, it lowers the recycle time for ports from 4 minutes to 30 seconds.

I set both of these registry keys on the DB Server and on the Web server and TAHDAAHHHHH! <happy dance>

We've been braining on this issue for a long time now and your help was valuable in getting us to try the right kinds of things that led to the actual answer. Smile

Thanks so much and I hope this solution helps other people, since it wasn't very easy to find.

Take care,

Ryan

No comments:

Post a Comment