MSSQL's openquery Saved Me
Posted by Brad Wood
Oct 04, 2008 11:34:00 UTC
Sorry I've been quiet for the past few days. My Flex adventures took a quick detour through the massive land of our AS400 and DB2. Now that I had a prototype of my pretty line, bar, and pie charts I needed some real data. My job was to write a process to fetch our sales data from the AS400 server here at work nightly and populate some SQL Server tables with it. Easy Peasy, I thought. I didn't know what I was in for.The first hurdle out of the gate was learning the structure of the files I was dealing with. (Apparently "tables" are called "files" in AS400 speak) I've been quite spoiled with extremely long and (mostly) descriptive table names in the SQL Server world. On the '400 however, file names are limited to only 10 characters. This leads to VERY creative abbreviations which make perfect sense to all the guys who have worked here for 10-20 years. :) Once I had the data figured out, I was on my way again. We have a linked server set up in MS SQL 2000 which connects to the database on the '400 as a standard ODBC data source. My queries were running a little less than fast. In fact, they were dog slow. It would take anywhere from 20 to 30 minutes to return about 1000 records from 8 tables! I started digging through my execution plans and was horrified to find that SQL Server was bringing over the ENTIRE contents of each table and performing all the joins and where clauses locally. That was over 9 Million records! To prove it, I tested a simple select like:
[code]select * from table where column = 'value'[/code]Sure enough, the entire contents of the table (millions) would get transferred over the network (taking 5-10 minutes), and THEN the where clause would get applied and the handful of records I wanted where returned. This tech note from Microsoft discusses the many reasons why SQL Server may not trust a remote data source to do its own processing. This ranges from data types in use, having a top clause, and compatible collations. My solution was simple and effective. It was the openquery() function that SQL Server supports. It "executes the specified pass-through query on the specified linked server." That basically means I can specify the AS400 linked server and directly pass it a string of SQL for it to process and hand back the results. Now, this SQL string must be SQL in the native dialect of your linked server. For instance, I have to pass a DB2 formatted query. Furthermore, the openquery() function has a really annoying "feature" of NOT accepting variables for an input. I'm not sure what Microsoft programmer fell asleep at his desk the day he was supposed to program that part. Regardless, this is what the code looks like now:
[code]SELECT * FROM OPENQUERY(as400, ' SELECT * FROM table WHERE column = ''value'' ')[/code]Biggest downside-- there is no code formatting to the SQL being passed in but hey, it's just a string that doesn't mean any thing to SQL Server, right? Biggest benefit-- it runs about 300x faster! This is because the remote server is doing all the joining, indexes and filtering and only passing back what matters to SQL Server.
Tags: AS400, Performance, SQL
Adrian J. Moreno
Brad, this is a great thing to know. My previous employer uses DB2 and man, I had no idea what I was walking into over there. You get 10 characters? Maybe you have more advanced green screens over there. We only got 8.
The COBOL programmers could tell you the name of any "file" off the top of their heads, as well as what it stored: "What's ZDTPW03C? Oh come on, the system wouldn't work without ZDTPW03C."
I'm sending this to some of my former coworkers. There are a couple of processes that require data from DB2 to be ported over to SQL Server and they take forever.
I love using OpenQuery for directly querying Excel files. It is definitely a handy function.
@Adrian: I believe the limit was originally 8 characters. At some point they made the big upgrade to 10! Our AS400 is version 5 Release 4.
Are you using ColdFusion? You can always add a JDBC driver to your ColdFusion installation (even in CF Standard).
The driver can be found in the Java/JTOpen Toolbox - http://sourceforge.net/projects/jt400
For AS400 running on the iSeries, I used the jt400Native.jar for the JDBC driver. You can then write sql statements in the form of: SELECT columns FROM library.file
We also wrote some programs with RPGLESQL and registered them as external stored procedures in the iSeries Navigator and used those from ColdFusion too.
And, in the end, it was all for a Flex application.
I'm assusming there's enough interest in this stuff and I'll post an article on it next week.
@Dutch: I am interested in the JDBC drivers for the AS400. Right now, none of our ColdFusion servers directly access the data from the 400. I am to copy the data for my Flex app over nightly into a MS SQL server database. So, CF will not be involved at all in the copying of the data.
Can you get around the 'no variables input' hurdle by dynamically building your openquery command instead? Build your statement and do this: EXEC sp_executesql @sqlStmt
@Brad I'll work on an article outlining the steps to take to making a connection to AS/400 with a JDBC driver.
We've been able to write queries against library files and also been able to execute stored procedures written in RPGLESQL (which included some heavy business logic we didn't want in our flex app or on the CF side). You're right about the column and file names, they're cryptic.
It all runs blazingly fast too.
@Mark: I have been meaning to try that. I haven't yet though.
@Dutch: That article sounds great. Let us know when it is finished.
currently i'm using it to interface with peoplesoft. word of advice, you might want to think about containing your openquery statements in a view. it make it alot easier when you have to inserts or updates.
I have several posts on my blog about using JTOpen to connect to to db2/400 from ColdFusion that you might find useful. If you need/want help getting the driver setup (it's super simple), let me know.
While the linked server in SQL Server is an option, I prefer not to have to make copies of data when I don't need to.
Of course, it would be helpful if I gave you the URL to the posts I'm talking about. Doh!
@Tony: Are you saying I make a view that does "select columns from table" via openquery and then I can update and insert into that view? I wouldn't think that would work, but it sounds like that is what you are saying.
@Rob: I'm with you on not wanting to have extra copies of the data. Unfortunately, my hands are kinda' tied on that since I don't have any control over the AS400 here at work. I think their main concern may be the performance load implications of tying the website directly to 400, but I'm not sure. A lot of it may just be related to the "this is how we've always done it" kind of thing.
Thanks for the link.
If I remember correctly, something like this should work for variables:
declare @myvar varchar(8)
set @myvar = 'value'
SELECT * FROM OPENQUERY(as400, ' SELECT * FROM table WHERE column = ''' + @myvar + ''' ')
Using the extra ' (single quote) before and after the variable.
@Mark: Using exec does work.
declare @sql as varchar(200) set @sql = ' SELECT * FROM OPENQUERY(as400, '' SELECT * FROM table '')' exec(@sql)
@dattac: The method you showed does not work. openquery appears to not accept variables OR concatenations of variables and strings. In fact, it doesn't even allow for two strings concatenated together.
Ahhh yes, using a combination or EXEC and variables is how I use to accomplish this:
declare @myvar varchar(8), @sql varchar(2000)
set @myvar = 'value'
set @sql = 'SELECT * FROM OPENQUERY(as400, '' SELECT * FROM table WHERE column = ''''' + @myvar + ''''' '')'
Or something similar to that (I'll have to check on it when I get home, it's been a while since I have used it). Again, just need to be careful with escaping all the single quotes.
Yep it will, at least for oracle and peoplesoft. i guess it depends if the database you're connecting to supports it.
This is the bomb... It sped up my query by 1000 percent. Thank you, Thank you, Thank you...
Hi champs, can any one tell me that how can i use OPENQUERY for inserting Records in the (linked Server) Database. Currently i want to add new record in (*.fdb) Navision Financial database. Actually i want any example which clear me the syntax of that query. Please help,
Thank you, Adeel Shafqat
@Adeel: I have never been able to do inserts with open query since my AS400 doesn't use primary keys and ODBC can't identify unique records.
As far as what the syntax should be, just look to Microsoft's MSDN pages: http://msdn.microsoft.com/en-us/library/ms188427.aspx
INSERT OPENQUERY (OracleSvr, 'SELECT name FROM joe.titles') VALUES ('NewTitle');
I have a big problem with Fujitsu Cobol, MS SQL Server and OPENQUERY command. This command is working: EXEC SQL SELECT St_code into :state
FROM OPENQUERY ([new-server], 'SELECT St_code from my_datatabes..USER_DATA where user = 1233321' ) END-EXEC
But I want use a variable instead of hard coded value 1233321. I tried: EXEC SQL SELECT St_code into :state
FROM OPENQUERY ([new-server], 'SELECT St_code from my_datatabes..USER_DATA where user = ' :user ) END-EXEC or move spaces to ws-command. string "'SELECT St_code from my_datatabes..USER_DATA where user = ' ws-user "'"
delimited by size into ws-command.
005830 EXEC SQL 005840 SELECT Ste_cde into :DL-ACCT-state
FROM OPENQUERY ([rnt-w2k8devdb], :ws-command) 003890 END-EXEC
and some other combinations with multiple quites - the message from SQL Server Profiler I was getting that @P1 parm was invalid, but it looks to me that OPENQUERY accepts only a hardoced string from Cobol source program and no parm is allowed. I hope I am wrong.
No, that's how it works unfortunately. You can type in exact text at parse time. No variables, no concatenation.
If you are calling this from an application server such as ColdFusion, you can generate the openQuery on the app server and pass to the DB. Your other option is to concatenate the entire openQuery() statement in SQL server and execute it with exec(). You'll need to use a temp table declared prior to the exec to put the results in since exex() operates in its own scope.
It's messy to read, but I have done that many times.