Content RSS Feed by ContentBox

ContentBox RSS Feed https://www.codersrevolution.com Content RSS Feed by ContentBox Ortus Solutions, Corp (www.ortussolutions.com) http://www.rssboard.org/rss-specification FeedGenerator Mon, 27 Jul 2026 19:55:02 GMT Mon, 27 Jul 2026 19:55:02 GMT BoxLang Query Of Query Improvements and Additions A few days ago, I blogged about how the new Query of Query support in BoxLang has greatly exceeded the performance of Adobe and Lucee's QoQ. If you haven't read that post, you can find it here: https://www.codersrevolution.com/blog/boxlangs-qoq-is-here-and-its-5x-faster-than-lucee-17x-faster-than-adobe I also said we had added a bunch of nice new features to BoxLang's QoQ, which I'll explain today. We built our QoQ support from scratch using a super fast ANTLR grammar based on SQLite which we customized for our needs. This gave us a great opportunity to incorporate some of those features out of the box. ANSI JOIN syntax Lucee and Adobe both support the so-called "table join" syntax where you can list more than one table in your FROM clause with a comma between them. Lucee allows more than two tables as well as table aliases. Adobe not only allows no more than 2 tables, it also doesn't support table aliases, which is a huge bummer. We support not only the ANSI join syntax like INNER JOIN, but we also have NO LIMIT on the number of tables you can join, and of course you can use table aliases. SELECT u.name, d.name as departmentName FROM qryUsers u INNER JOIN qryDepartments d ON u.departmentID = d.departmentID The way aliases are resolved is: You can specify the whole table name as qryUsers.name Or you can specify the table alias as u.name Or you can specify just the column name SO LONG AS the column name is only found in one table ANSI join syntax will produce faster executing queries since we're able to apply the ON clause while we process each join, limiting the number of row combinations the WHERE clause needs applied to. JOIN types This is another huge addition-- filling in a big functionalitygap CF has had for years. CROSS JOIN Lucee and Adobe CF only allow the equivalent of a CROSS JOIN, in which the Cartesian product of both tables is produced and you must filter out the rows you don't want in the WHERE clause. We also support this using the ANSI join sytax SELECT * FROM table1 t1 CROSS JOIN table2 t2 WHERE t1.col = t2.col CROSS JOINs cannot have an ON clause. INNER JOIN This is the most common join type. It creates a Cartesian product of both tables filtering the intersection based the boolean result of an ON clause. SELECT c.name as colorName, f.name as fruitName FROM fruit f INNER JOIN colors c ON f.color = c.name That query will only return rows for colors and fruits which have a match. LEFT (OUTER) JOIN LEFT JOIN, or LEFT OUTER JOIN is a type of outer join that contains one row for every row in the first table. Values for the second table will only be available where the intersection of the ON clause matched rows in the second query. Nulls will be present otherwise. SELECT c.name as colorName, f.name as fruitName FROM fruit f LEFT OUTER JOIN colors c ON f.color = c.name The above query would return one row for every fruit, whether or not it has a matching color in the colors table. RIGHT (OUTER) JOIN RIGHT JOIN or RIGHT OUTER JOIN is a type of outer jon that is the opposite of a LEFT join. It contains one row for every row in the second table. Values for the first table will only be available where the intersection of the ON clause matched rows in the first query. Nulls will be present otherwise. SELECT c.name as colorName, f.name as fruitName FROM fruit f RIGHT OUTER JOIN colors c ON f.color = c.name The above query would return one row for every color, whether or not it has a matching fruit in the fruittable. FULL (OUTER) JOIN FULL JOIN or FULL OUTER JOIN is a type of OUTER join that combines all rows from both tables. If the ON clause does does match one of the tables, nulls will be present. SELECT c.name as colorName, f.name as fruitName FROM fruit f FULL OUTER JOIN colors c ON f.color = c.name That query will return a row for every fruit whether or not it has a matching color, and every color whether or not it has a matching fruit. LIMIT Nand TOP N Adobe does not support LIMIT or TOP. Lucee supports ONLY TOP. BoxLang supports both! SELECT TOP 1 * from qryData or SELECT * from qryData LIMIT 1 In Adobe CF, you must use the maxRows option in the query, but this can ONLY be applied to the overall select, making it worthless to use with unions or nested selects. Table aliases Adobe CF doesn't support any sort of table aliases. Lucee does, and so does BoxLang. SELECT * FROM qryData d or SELECT * FROM qryData as d Subqueries This is another super-powerful feature QoQ has been missing. The ability to take an entire select statement, wrap it up in parenthesis and use it as a subquery. It comes in two forms. FROM or JOIN Subquery You can replace any table name in a FROM or JOIN clause with a subquery in parentheses followed by a table alias. The select inside the parenthesis can have TOP, WHERE, JOINS, ORDER BY, or UNIONS. Basically anything and everything is possible. We only supportnon-correlated subqueries-- meaning they are fully contained and don't reference any tables outside of themselves. SELECT * FROM ( SELECT TOP 1 * FROM qryData WHERE col = 'value' ORDER BY col desc ) d WHERE foo = 'bar' ORDER BY foo or SELECT e.*, s.name as supName FROM qryEmployees e INNER JOIN (SELECT * FROM qryEmployees WHERE supervisor = true) s on e.supervisor = s.name IN or NOT IN Subquery You can also nest an entire nested select inside the parentheses of an IN or NOT IN statement instead of providing a list of values. No alias is needed in this case, as this query can't be referenced in the SELECT clause. Your subquery must select a single column. SELECT * FROM qryUsers where dept IN (SELECT departmentName FROM qryDept WHERE departmentType = 'Excecutive' ) CASE Statements A super-handy feature of most DB's is the CASE statement, which is similar to an if/else or switch statement in CF. There are two flavors of case and we support them both. Case is an expression, so it can be used anywhere, including as the input to a function, WHERE clause, or ORDER BY. Standard Case The first CASE statement doesn't take an input expressions, but expects each WHEN to contain a boolean expression which evaluates true or false. The WHEN statements don't need to operate on the same value since they each have their own condition. SELECT name, CASE WHEN name = 'brad' THEN 'me' WHEN name = 'luis' THEN 'boss' WHEN name = 'jacob' THEN 'Mr. Kansas City' ELSE 'default' END as title FROM qryEmployees Input Case This variation of the CASE statement is more like a switch in that you feed it an input expression which is then matched against each WHEN. In this case, each WHEN isn't given a boolean conditional, but instead an expression to compareto the input expression. This helps reduce boilerplate. SELECT name, CASE name WHEN 'brad' THEN 'me' WHEN 'luis' THEN 'boss' WHEN 'jacob' THEN 'Mr. Kansas City' ELSE 'other' END as title FROM qryEmployees Custom Function Registration This is a really sweet addition. If you've ever wished you could inject a bit of custom logic into your SQL like with how SQLServer lets you register custom functions, you can do this now with ANY functional object. This means, any UDF Closure Lambda BIF (Built in Function) Java method reference can all be registered with BoxLang to be usable as a custom function right inside your SQL. Scalar A scalar function is used to process a single cell from a table and return a single value. queryRegisterFunction( "reverse", ::reverse ); q = queryExecute( "SELECT reverse( 'Brad' ) as rev", [], { dbType : "query" } ); Here we want to tap into the built in function reverse(), so we pass in a functional wrapper to the BIF. ::reverse is a new BoxLang-only syntax. The equivalent in CF would be this slightly more verbose version: queryRegisterFunction( "reverse", ( input )=>reverse( input ) ); Aggregate An aggregate function is used to process zero or more values from a given row at a time. Each argument is passed as an array of values, with nulls removed. queryRegisterFunction( "arrayToList", ::arrayToList, "varchar", "aggregate" ) q = queryExecute( " SELECT arrayToList( name ) as names FROM qryEmployess ", [], { dbType : "query" } ); Here we pass in a BIF functional wrapper around the arrayToList() BIF to create a command-delimited list of the names. Note, we've also added in a first-classstring_agg() and group_concat() function which does the same thing, but this is just a contrived example. Remember, you don't have to just wrap up BIFs. You can pass any custom closure or UDF you want to, which runs any app logic you wish! Functions Adobe supports a pretty limited number of functions. BoxLang supports these functions out-of-the-box. Lucee does as well. Math abs() - absolute value acos() - arccosine asin() - arcsine atan() - arctangent cos() - cosine sin() - sine tan() tangent exp() - returns e raised to the power of a specified number sqrt() - square root ceiling() - rounds up to next closest integer floor() - rounds down to next closest integer Null Handling coalesce() - takes N args, returning the first non-null isNull() - takes two args, returning the second if the first is null String left() - Take left n chars right() - take right nchars length() - return string length rTrim() - trim whitespace on right side of string lTrim() - trim whitespace on left side of string lcase() - same as lower() ucase() - same as upper() concat() - concatenates any number of args into a single string Misc convert() - Same as cast, but allows type to be dynamic. convert( column, "varchar" ) is the same as cast( column as varchar ) We've also added the following aggregate functions, which are popular in other DBs string_agg() - Concatenate values using specified delimiter group_concat() - same as string_agg() Operators We've also added the following bitwise operators that neither Adobe or CF support. Binary ^ - bitwise Exclusive OR(XOR) & - bitwise AND | - bitwise OR Unary ! -bitwise NOT Literals Finally, we also have support for the following literal values. Lucee supports these as well, but Adobe CF does not. true/false null UNION DISTINCT There are 3 ways you can write a UNION in most databases UNION ALL (doesn't de-dupe rows) UNION DISTINCT(de-dupes rows) UNION (same as DISTINCT) Adobe only supports UNION and UNION ALL. Lucee and BoxLang both support the UNION DISTINCT syntax as well, which is the same as just UNION, but it's nice to support this syntax as it's common in other DBs. Conclusion So, as you can see, there's a lot of new stuff going on in our Query of Query implementation. We're excited about the new features and productivity we've been able to add to take QoQ to the next level. If there's any features Adobe CF or Lucee are supporting that we've missed, please let us know! We've already incorporated all of Lucee's QoQ tests from their test suite into our BoxLang test suite and can confirm they are all already passing on BoxLang. The ANTLR grammar built on an existing community project is very easy to modify and make changes to. In the future, I'd love to see the ability to delete rows from a query update rows in a query insert rows into a query all via query of queries. The pieces are in place, so it's just a matter of putting them together to make BoxLang the most powerful platform for CFML and BL code in existence! https://www.codersrevolution.com/blog/boxlang-query-of-query-improvements-and-additions Mon, 13 Jan 2025 20:53:00 GMT brad@ortussolutions.com (Brad Wood) https://www.codersrevolution.com/blog/boxlang-query-of-query-improvements-and-additions BoxLang's QoQ Is Here, And It's 5x Faster Than Lucee, 17x Faster Than Adobe! As BoxLang (our new CF-compatible JVM language) nears its final release, we're very pleased to announce that Query of Query support is ready for testing! QoQ often times draws a variety of reactions from people, but it's a really nice feature to run any SQL select you want against 1 or more in-memory queries for the purpose of filtering, aggregating, or joining. One of the biggest complaints is performance, which is why I've performed 2 rounds of performance enhancements to Lucee's QoQ support in the past which I have detailed here and here. Building on the knowledge and experience I got from overhauling Lucee's QoQ, I incorporated those lessons into the architecture of every part of BoxLang's query object, and QoQ implementation. I'm pleased to have a fully-functional 100% Java implementation of QoQ which supports everything Adobe and Lucee QoQ supportand much more. We're not using an HSQLDB fallback-- everything is pure Java and super optimized. I was able to make some big architectural changes since this was being built from the ground-up. I've also incorporated all of the QoQ unit tests from the Lucee TestBox test suite into BoxLang and they are all passing! Parallel Java Streams are incorporated everywhere for blazing performance SQL data types are tracked at every turn for optimized data operations without needing to cast values Data shuffling is minimized by performing as much work as possible on a stream of integers representing the rows of the query to be selected. The final result isn't built until needed Short circuit scenarios are detected and optimized (like using LIMIT with no ORDER BY can stop iterating as soon as the required number of rows are acquired) An ANTLR parsing grammarbased on SQLite is used for fast parsing that is easily customizable and supports many new constructs that Lucee and Adobe don't support like CASE statements as INNER JOINs (more on this in a coming blog post) SQL Functions are implemented in a pluggable manner, allowing you to register custom SQL functions at runtime to use directly inside your QoQ A minimal locking strategy has been implemented that keeps the query objects thread safe, but with very little contention under multi-threaded use I feel the need... the need for speed. I'll put out another post soon detailing the new syntax features BoxLang has incorporated, which really createsome substantial distance between what it's capable of and what you get with Adobe and Lucee. This post, however, is all about speed. My goal was to match or exceed the performance of the other engines. My first post on improving Lucee's QoQ used a test suite of a single 30,000 row query object. My second post, used a single query of 1 Million rows. Sadly, I couldn't include Adobe CF in the second post because it's performance at that scale was just dismal. I'm pleased to report that Adobe has made some improvements to their QoQ as it was able to hang with the other engines now, so I'll be including all 3: BoxLang 1.0.0-snapshot Lucee Server 6.1.1+118 Adobe ColdFusion 2023.0.12.330713 This time, I'm using 3 query objects (for the join tests). The same data I used in the second post (1 Million rows) and 2 additional smaller queries of around 5 rows each. I'm basically re-using the exact same SQL selects I used in my previous post on Lucee, but I had to "dumb them down" a bit due to Adobe CF not supporting all the same features (like `true` as a literal value, `TOP 1`, table aliases, or the ucase() function). But for the most part, it's all the same as before to keep an even playing field across my tests and not "stack the deck" in a way that favors BoxLang. The proof of the pudding is in the eating Ok, time for the tasty stuff. Each set of numbers represents the milliseconds to complete the query, so a lower number is better because it was faster. I ran each select 3 times to warm up the server, recording the value of the last run. BoxLang outran Lucee and Adobe in every single test. Sometimes by a factor of 80 times faster!You may be wondering where I got the numbers "5x faster than Lucee, and 17x faster than Adobe" from in the blog title. What I did was calculate how many times faster BoxLang was in each of the tests (excluding any tests that did not run on all 3 engines) and then averaged those results. I tried to cover a wide variety of different QoQ examples, but it's worth noting the performance gains you see in your app will be a factor of what your SQL uses and how many rows you have in the queries. Lucee, and especially BoxLang do much better at larger queries and I am using 1 Million rows in each test. For QoQs ran against smaller datasets, the performance difference may be closer to only 2x (the difference of 2ms to 4ms, etc) I have a graph at the end which shows the improvement of each engine over a range of data set sizes. Update: I have published the code used for these tests to a GitHub repo so anyone can run them https://github.com/bdw429s/cfml-qoq-perf-tests Basic Select SELECT name, age, upper( email ) as email, department, isContract, yearsEmployed, sickDaysLeft, hireDate, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees WHERE age > 20 AND department IN ('Accounting','IT','HR') AND isActive = 1 ORDER BY department, isContract, yearsEmployed desc BoxLang Lucee Adobe CF 250 ms 3,192 ms 1,809 ms (Click to enlarge) This is a basic non-grouping, non-aggregate select with a WHERE clause, a couple expressions in the SELECT list, and an ORDER BY. BoxLang is 13x faster than Lucee and 7x faster than Adobe CF. Basic UNION SELECT name, age, upper( email ) as email, department, isContract, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees where age > 20 AND department = 'HR' AND isActive = 1 UNION SELECT name, age, upper( email ) as email, department, isContract, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees where age > 20 AND department = 'Accounting' AND isActive = 0 order by department, name, email, age desc BoxLang Lucee Adobe CF 253 ms 1,220 ms 490 ms (Click to enlarge) Here we have two non-grouping, non-aggregates selects UNION'ed together (DISTINCT is implied). BoxLang is 5x faster than Lucee and 2x faster than Adobe CF. Non-Grouping Aggregate SELECT max(age) as maxAge, min(age) as minAge, count(1) as theCount FROM employees where department IN ('Accounting','IT') AND isActive = 1 BoxLang Lucee Adobe CF 173 ms 573 ms 4,290 ms (Click to enlarge) Here we have simple non-grouping, aggregate select. It only outputs one row, so it's already fairly fast. BoxLang is 3x faster than Lucee and 25x faster than Adobe CF. Grouped Aggregate SELECT age, department, isActive, isContract, count(1) as theCount FROM employees where age > 20 AND isActive = 1 group by age, department, isActive, isContract HAVING count(1) > 3 ORDER BY age, department, isActive, isContract BoxLang Lucee Adobe CF 109 ms 841 ms 8,485 ms (Click to enlarge) This select uses a WHERE, a GROUP BY, a HAVING, an ORDER BY and aggregate functions-- so basicallyeverything! BoxLang is 8x faster than Lucee and 78x faster than Adobe CF. I'm honestly not sure why Adobe is so slow here, but I can say it is consistently slow with large data sets (see the last graph below). Basic String Concat SELECT name + department as departmentName FROM employees BoxLang Lucee Adobe CF 289 ms 377 ms 4,041 ms (Click to enlarge) This simply example highlights the benefits of BoxLang internally tracking all data types to optimize operations without needing to test and cast. We also use StringBuilder wherever possible for best performance. BoxLang is the same speed as Lucee and 14x faster than Adobe CF. Basic LIKE operator SELECT * FROM employees where name like '%Harry%' BoxLang Lucee Adobe CF 116 ms 341 ms 673 ms (Click to enlarge) This is another simple test like above, but a very common use case. BoxLang caches compiled Regex patterns. BoxLang is 3x faster than Lucee and 6x faster than Adobe CF. 2-Table Join SELECT employees.name, departments.name as department, slogan FROM employees, departments WHERE employees.department = departments.name and employees.isActive = 1 and age > 20 AND department IN ('Accounting','IT','HR') order by employees.name BoxLang Lucee Adobe CF 596 ms 2,315 ms 2,107 ms (Click to enlarge) The join tests are new. Lucee and Adobe only support the "comma join" syntax, which is an inherent CROSS JOIN, with the limiting condition in the WHERE clause. Here we have a simple join from our employees table (1 million rows) to our departments table (5 rows). Again, BoxLang was architected in such a way that it doesn't create any intermediate copies of the joined data. Instead, it deals with a Java Stream of int[] arrays representing the intersections of the rows from each table. This is a very lightweight and fast data structure and we only build up the final data at the end. BoxLang is 4x faster than Lucee and 4x faster than Adobe CF. 3-Table Join SELECT employees.name, departments.name as department, slogan, hexcode FROM employees, departments, colors WHERE employees.department = departments.name and employees.favoriteColor = colors.name and employees.isActive = 1 and age > 20 AND department IN ('Accounting','IT','HR') order by employees.name INNER JOIN version (BoxLang only) SELECT e.name, d.name as department, slogan, hexcode FROM employees e inner join departments d on e.department = d.name inner join colors c on e.favoriteColor = c.name WHERE e.isActive = 1 and e.age > 20 AND e.department IN ('Accounting','IT','HR') order by e.name BoxLang BoxLang INNER JOIN Lucee 939 ms 546 ms 2,892 ms (Click to enlarge) Here we lose Adobe CF as it has an imposed limit of only 2 tables being joined at a time. Lucee (via HSQLDB) supports more than 2 tables. BoxLang has gone a step further, and also supports ANSI join syntax (INNER JOIN) which is superior as it allows us to optimize our filtering of the rows since we can apply the ON clause of each join as we process them, limiting the Cartesian product the WHERE clause operates on. Adobe CF is not represented here, but we have Lucee and BoxLang (running the "comma join" syntax) and then BoxLang AGAIN (running the INNER JOIN syntax) to show how it's even better. BoxLang's "comma join" is 3x faster than Lucee, and BoxLang's INNER JOIN is 5x faster than Lucee. The result of this test were not included in the average performance times in the blog title since Adobe does not support this query. Multi-Table UNION SELECT name, department FROM employees WHERE department = 'Accounting' UNION SELECT name, department FROM employees WHERE department = 'It' UNION SELECT name, department FROM employees WHERE department = 'HR' UNION SELECT name, department FROM employees WHERE department = 'Executive' UNION SELECT name, department FROM employees WHERE department = 'Janitorial' order by department, name BoxLang Lucee Adobe CF 1,123 ms 6,989 ms 2,809 ms (Click to enlarge) I have a 2-table UNION above, but I added this test to specifically showcase some optimizations BoxLang implements using using DISTINCT UNIONs. We wait to de-dupe the final result until we reach the last distinct union, preventing unnecessary processing. (UNION is inherently DISTINCT unless you specify UNION ALL) BoxLang is 6x faster than Lucee and 3x faster than Adobe. Grouped Aggregate Across Different Data Sizes SELECT age, department, isActive, isContract, count(1) as _count FROM employees where age > 20 AND isActive = 1 group by age, department, isActive, isContract HAVING count(1) > 3 ORDER BY age, department, isActive, isContract I'm not providing the full data here for brevity, but I'll gladly share it with anyone who is curious. BoxLang Lucee Adobe CF (Click to enlarge) You may recognize this query from above. It's a basic grouped, ordered select. I ran that query roughly 50 times on a data set starting at 20 rows and working its way up to just over 1 Million rows. You can see the relative performance stays close under around 20,000 rows, but then things really start to spread out. Adobe's performance appears to get exponentially worse. Lucee slows down linearly, but at a much steeper angle than BoxLang, which stays VERY fast as the data set grows. BoxLang gets up to 10x faster than Lucee and almost 90x faster than Adobe CF! The result of this test were not included in the average performance times in the blog title since we included this query above already. BoxLang Lucee Adobe CF (Click to enlarge) Let's zoom in our Y axis a little so we can get a better look at Lucee and BoxLang. While Lucee nears 1 second executions, BoxLang barely breaks 100 ms. Conclusion Adobe CF has gotten better in the past, seeing as how it was so bad 2 years ago, I couldn't even include it in the comparison, so that's a plus for them Lucee is overall pretty fast given all the work I've put into it. It also uses parallel Java streams and more internally. BoxLang is still beating the socks off of both of them! Please give your app a test on BoxLang and report any issues you may have with QoQ or log feature requests. https://www.codersrevolution.com/blog/boxlangs-qoq-is-here-and-its-5x-faster-than-lucee-17x-faster-than-adobe Fri, 27 Dec 2024 00:00:00 GMT brad@ortussolutions.com (Brad Wood) Java Lucee Performance https://www.codersrevolution.com/blog/boxlangs-qoq-is-here-and-its-5x-faster-than-lucee-17x-faster-than-adobe Improving Lucee's QoQ Support Again- now 200% faster Update 1/27/2023 - the pull has now been merged into the snapshot builds of Lucee 6! Two years ago, I published this post detailing how I had refactored the Query of Query support in Lucee to be much better and also much faster: https://www.codersrevolution.com/blog/improving-lucees-query-of-query-support I removed the single-threaded HSQLDB callback for grouped/aggregate and distinct selects and tuned the performance. QoQ's are a bit of a polarizing feature in CFML. They've suffered in the past from poor support and poor performance which has caused a lot of people to avoid them. There are certainly places where queryMap(), queryFilter(), and queryReduce() are the best approach, but there are also times where you simply can't beat the readability and conciseness of an ordered, aggregated select. I know developers who tell me they never use "reduce" higher order functions because they are too confusing, but I've never met a developer who didn't understand "GROUP BY department"! I've got several other ideas to increase the feature set of QoQ in CFML including native support for INNER and OUTER joins as well as this idea of exposing CFML string, number, and date functions directly in QoQ. I've also put in a handful of tickets for Adobe ColdFusion to invite them to follow in the improvements I've added to Lucee. I'd send them pulls, if I could : ( The Improvement OK, so let's get into it. This improvement was something I wanted to play with ever since my first foray into Lucee's QoQ code, and that was to use Java's parallel streams to improve the performance. Basically when you process a large query object, the rows are looped over sequentially, which only really uses a single CPU core for the loop. Java Streams are a feature that came out in Java 8 (see cbstreams for the CFML wrapper) and one of their most amazing features is the ability to easily mark a stream as parallel, which will use a pool of threads to process the items. Java Streams use the JDK's default fork/join pool which has as many threads as you have CPU cores. This is perfect for CPU-bound operations and also means the more cores your server has, the better performance gains you'll see! Most of the logic that runs for a Query of Query is inside of a loop. For example: SELECT * FROM qry WHERE col > 5 ORDER BY col will Loop over the source qry to filter records matching the WHERE clause Loop over the matches to build up the new query object Loop over the final query to apply the sort to it Partitioned queries such as DISTINCT, GROUP BY, or any select with an aggregate function have even more steps to organize the data. It only makes sense to tap into all your CPU cores to do this. The Hurdles Generally speaking, this change wasn't too difficult, it just required a lot of testing, and patience to put up with writing Java for a week, lol. As there is not a way to represent a query row as an individual object in Lucee's core (data is stored in columns), I used Java's IntStream class to create a stream from 1 to the recordcount to "drive" the process, where the current int in the stream represented the row number. There were several bits of logic in Lucee that were not thread safe initially. For example, when adding a row to a query, the added row number was not returned, but instead the current recordcount which, in a multi-threaded environment, may not be the specific row you just added, but instead a row another thread added! This is already a knownissue when adding rows in CFML too. I also carefully examined the locking Lucee performs under the hood to keep query objects safe. Java's oldest and simplest locking mechanism is to mark an entire method as synchronized or to use a synchronized block around some code. This is the equivalent of an exclusive CFLock in CFML andwhile effective and fast, is often times heavy-handed. Lucee's query objects use a native Java array in each column for performance, but one huge downside is a native Java array cannot be resized. Once it fills up and you attempt to add a new row, a new larger array must be declared, all data copied over to it, and swapped out. This is fast-ish, but requires all set operations on a query to employ a lock to ensure they don'ttry to set data while the internal native array is being grown. This also means only one thread can call a set() method at the same time! Java has read/write locks, but they perform so slowly under load, they are as bad a synchronized methods. I minimized the locking as best I could and entered a ticket for a future enhancement to completely re-think how Lucee stores data with a growable array of array "chunks". The rest of the hurdles were mostly Java being the annoying, pedantic, strict language that Java is. The functional interfaces for Lambdas also do not allow any typed exceptions to be thrown which is a terrible design on Oracle's partand requires additional boilerplate to work around. Trust me CF devs, you don't know how good you have it! A Few Details My work for this culminated in this ticket:https://luceeserver.atlassian.net/browse/LDEV-4298 And this pull request:https://github.com/lucee/Lucee/pull/1887 Big thanks to Peter Amiri who offered to help sponsor this work. Without this I wouldn't have had the time to complete this effort. You can read more detail there including a new setting to control the parallelism in QoQ if you want. By default, any QoQ on a query object less than 50 rows will execute sequentially (no threads) because the overhead of managing the joining the threads is normally more than the benifit. 50+ rows seems to be where the benifit outweighs the overhead, so all query objectswith that least that many rows will be processed in parallel. I also completely rewrote the ORDER BY functionality. Previous versions would sort the entire query object once for each order by column, starting from the right to the left, to leave the final query in the sort desired. This was greatly improved by not only switching to a parallel stream for sorting, but also writing a new QueryComparator class which sorts all columns at the same time, short circuiting when possible. If the first column's data sorts higher or lower than the other row, there's no need to keep comparing. The new single-pass QueryComparator provided up to 3xfaster speed, and when you add in the parallel streams, large ORDER BY clauses are now up to 5xfaster! It's also worth noting that the outward behavior of QoQ should go untouched with one small caveat. Previously, if you didn't use an ORDER BY clause in your QoQ, the records for a simple select would tend to say in their original order dueto the implicit encounter order of a standard loop. With parallel streams, you can't know which threads will complete first, so the rows may be in an order you don't expect. This, of course, is by-design and the appropriate fix is to use an ORDER BY if the order of the final results is important to you. The Results Ok, time for the good stuff. The asynchronous improvements had a huge affect. I put "200%" in the blog post title and I'm well aware of the ambiguity of that statement, lol. The results vary,but generally speaking, any QoQ of relative complexity completes in less than half the original time, giving you a throughput of more than twice the original. Sowhether you want to call that a 50%, 100% or 200% improvement is up to you :) I found the improvement in performance to be fairly stable regardless of the number of rows in the source query. The larger determining factor of improvement was actually the complexity of the query. So the number of selects, number of group bys, size of the where, number of order bys, etc has the most affect. The more your queries do, the faster they will get. The last time I did this, I used a source query with 30K rows and 100 passes. I really wanted to push the limits this time, so I tested a single pass on a 1 Million row query. I was pleasantly surprised with how Lucee handles 1 million rows in a QoQ in stride! I wanted to compare to Adobe CF again, however I was NOT impressed with Adobe's performance. Once you get over 200K - 300K rows, the wheels start to fall off Adobe's QoQ. Therefore, I did not include Adobe in the comparisons because they couldn't take the preverbal heat in the kitchen. I have one graph that shows Adobe and Lucee side by side across a varying range of query sizes up to 1 Million. Where the graphs say Lucee 5, that is Lucee 5.3.10+97 Where the graphs say Lucee 6, that is the bleeding edge of Lucee 6.0.0.300-SNAPSHOT WITH MY NEW CHANGES that are present in the pull request above. Where the graph says Adobe, that is Adobe ColdFusion2021.0.05.330109 Each set of numbers represents the milliseconds to complete the query, so a lower number is better because it was faster. I ran each test at least once to ensure everything was compiled and warmed up. Query 1 SELECT name, age, ucase( email ) as email, department, isContract, yearsEmployed, sickDaysLeft, hireDate, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees WHERE age > 20 AND department IN ('Acounting','IT','HR') AND isActive = true ORDER BY department, isContract, yearsEmployed desc Lucee 6 Lucee 5 1,123 ms 3,595 ms (Click to enlarge) This is a basic non-grouping, non-aggregate select with a WHERE clause, a couple expressions in the SELECT list, and an ORDER BY. The new QoQ is over 3x faster than before. Query 2 SELECT name, age, ucase( email ) as email, department, isContract, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees where age > 20 AND department = 'HR' AND isActive = true UNION SELECT name, age, ucase( email ) as email, department, isContract, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees where age > 20 AND department = 'Acounting' AND isActive = false order by department, name, email, age desc Lucee 6 Lucee 5 1,567 ms 3,798 ms (Click to enlarge) Here we have two non-grouping, non-aggregates selects UNION'ed together (DISTINCT is implied). The new QoQ is 2.4x faster than before. Query 3 SELECT max(age) as maxAge, min(age) as minAge, count(1) as count FROM employees where department IN ('Acounting','IT') AND isActive = true Lucee 6 Lucee 5 683 ms 1,159 ms (Click to enlarge) Here we have simple non-grouping, aggregate select. It only outputs one row, so it's already fairly fast. The new QoQ is 1.7x faster than before. Query 4 SELECT age, department, isActive, isContract, count(1) as count FROM employees where age > 20 AND isActive = true group by age, department, isActive, isContract HAVING count(1) > 3 ORDER BY age, department, isActive, isContract Lucee 6 Lucee 5 357 ms 1,136 ms (Click to enlarge) This select uses a WHERE, a GROUP BY, a HAVING, an ORDER BY and aggregate functions-- so basicallyeverything! The new QoQ is over 3x faster than before. The next two graphs highlight some serious performance issues that I consider bugs in Lucee 5 and Adobe CF. A small overhead can become a large overhead under enough load. Lucee String Concat SELECT name + age as nameAge FROM employees Lucee 6 Lucee 5 252 ms 5,953 ms (Click to enlarge) This query may look simple, but it performs terrible in Lucee 5. QoQ overloads the "+" operator to perform an arithmetic addition on two values that can be cast to numbers, but it concatenates if both values can't be cast. Previously, Lucee would attempt to cast both values to a number inside of a try/catch, and if it failed, assume they were strings. This bogs down heavily under load and I refactored the logic to use the isNumeric() check instead. Lucee is now 24x faster in this example. I avoided using the "+" operator in the other tests due to this. Adobe LIKE operator SELECT * FROM employees where name like '%Harry%' Lucee 6 Lucee 5 Adobe CF 17 ms 61 ms 3,145 ms (Click to enlarge) This query may also look look simple, but it performs terrible in Adobe CF. Behind the scenes, the LIKE operator is compiled to a regular expression. (Java regex gets compiled!) Lucee has always cached these compiled regex's, but from looking at stack traces, it appears Adobe CF re-compiles the regex every single time! This creates quite a slow down in Adobe QoQ's using LIKE. Lucee 5 was 51x faster than Adobe CF and Lucee 6 is 185xfaster than Adobe CF in this test. I avoided using LIKE in the other tests due to this. I also tried to put in a ticket for Adobe in their bug tracker, but it was acting up that day. I ended up having to E-mail the engineering team to enter the ticket for me and it doesn't show up anywhere so I'm not entirely sure what happened to it :/ This final test shows the relative performance of Lucee 5, Lucee 6, and Adobe CF side-by-side running the same QoQ against source queries of different size. Lucee 5, Lucee 6, and Adobe CF side-by-side SELECT age, department, isActive, isContract, count(1) as _count FROM qry where age > 20 AND isActive = 1 group by age, department, isActive, isContract HAVING count(1) > 3 ORDER BY age, department, isActive, isContract I'm not providing the full data here for brevity, but I'll gladly share it with anyone who is curious. Lucee 6 Lucee 5 Adobe CF (Click to enlarge) You may recognize this query from above. It's a basic grouped, ordered select. Looking at the graph, wowzas! Adobe's QoQ grinds down to a crawl once you get past 300K rows! I can see why people have historically shied away from using QoQ for larger amounts of data. But it's time to challenge those old views-- at least if you're using Lucee! Let's zoom in a bit on that graph so we can see how Lucee 5 compares to Lucee 6 across the same range of query sizes. Lucee 6 Lucee 5 Adobe CF (Click to enlarge) That's better. As you can see, Lucee was alreadyquite stable selecting from large result sets, and these changes in Lucee 6 have a consistent savings which make QoQ's around 2x faster across the board. Conclusion Java parallel streams are pretty cool. (See the xxxEach() functions with parallel=true for a similar CFML experience) I can't wait for Lucee 6 to come out (assuming my pull request gets merged!) QoQ is very much a performant and useful tool, no matter what some people say about it :) Adobe needs to step up their game on QoQ. Lucee is starting to make them wear clown shoes by showing just how good QoQ can be if some time is spent to tune it. I look forward to this getting some testing as Lucee 6 nears release and I'd love to see some performance numbers from anyone who relies heavily on QoQs for reporting or otherwise. https://www.codersrevolution.com/blog/improving-lucees-qoq-support-again-now-200-faster Thu, 01 Dec 2022 20:12:00 GMT brad@ortussolutions.com (Brad Wood) Java Lucee Performance https://www.codersrevolution.com/blog/improving-lucees-qoq-support-again-now-200-faster Detect if a User is Online with RabbitMQ Web Stomp We're using RabbitMQ and its Web Stomp plugin for websockets for several projects at work. Using a Stomp.js library in the browser, our app users connect and subscribe to topics using their username and JWT, which we validate using a custom HTTP back end auth in Rabbit. I've recently written a rest-over-stomp module for ColdBox MVC which allows you to push the response of any Coldbox event or API call out over a websocket channel to any browser listening on that channel. This allows for the following Browser can request data and receive it async Any random server-side process can simply decide to push fresh data out to browser Each user subscribes to a custom topic specific to them (via permissions enforced by my custom HTTP backend auth) so I have a direct data bus to any users's browser Unlike Ajax calls, there is no HTTP/TCP negotiation of each request since the websocket is a persistent connection to the server Detect if a User is Online The reason I wrote this post however was to show a nice trick to detect if a user is online. For example, a back end process on the server decides that some data for a user is out of date and wants to push that data out to the user's browser so they have the latest copy. However, if the user just closed their browser and is no longer online, I don't want to perform the work of collecting fresh data and publishing it to a RabbitMQ topic for which there are no STOMP subscribers. Since I'm authenticating each user with their username and JWT from the browser, I can use the RabbitMQ management API to tell if any given user is online, or at least connected to my RabbitMQ server right then before I push out data over a topic for them. You must have the RabbitMQ management plugin installed to use the API. Here is a simple function that I use inside a CommandBox task runner to detect if a given user is online based on their unique username sent when their browser authenticates to Rabbit. function isUserOnLineRightNow( required string name ) { http url="http://rabbithost:15672/api/connections" username="adminuser" password="adminPass" result="local.result"; if( local.result.status_code == 200 && isJSON( local.result.fileContent ) ) { return !!deserializeJSON( local.result.fileContent ).find( (c)=>c.user == name ); } print.redLine( 'Unable to check RabbitMQ management API for online users. #local.result.statuscode# #local.result.errordetail# #left( toString( local.result.fileContent ), 100 )#' ) return false; } You can hit that API endpoint directly in a browser and enter your Rabbit admin username and password in the basic auth prompt of your browser to see the data. References RabbitMQ-https://www.rabbitmq.com/ Rabbit Management plugin-https://www.rabbitmq.com/management.html Rabbit Management plugin API docs-https://rawcdn.githack.com/rabbitmq/rabbitmq-server/v3.8.14/deps/rabbitmq_management/priv/www/api/index.html Rabbit STOMP plugin-https://www.rabbitmq.com/stomp.html Rabbit Web Stomp plugin-https://www.rabbitmq.com/web-stomp.html HTTP backend Auth plugin -https://github.com/rabbitmq/rabbitmq-auth-backend-http/blob/v3.7.x/README.md REST-over-STOMP ColdBox module -https://github.com/Ortus-Solutions/REST-over-STOMP https://www.codersrevolution.com/blog/detect-if-a-user-is-online-with-rabbitmq-web-stomp Sat, 13 Mar 2021 00:41:00 GMT brad@ortussolutions.com (Brad Wood) CFML RabbitMQ https://www.codersrevolution.com/blog/detect-if-a-user-is-online-with-rabbitmq-web-stomp A quick comparison of using Python and CFML to write the same CLI tool A few days ago, Joseph Lamoree posted about a cool little command line tool he wrote in Python that would scan a list of servers check check for a public facing administrator. https://github.com/ecivis/scan-cfml-admin It's run as follows: (env) $ ./scan.py targets.json I thought this would be a great example to compare and contrast writing the same simple command line tool in CFML using a CommandBox task runner. Here is what I came up with https://github.com/bdw429s/scan-cfml-admin It's run as follows: box task run :targetsFile=targets.json I added some fun stuff in like a progress bar, interactive job ASCII UI, env var replacements in the JSON, and async processing of URLs, but mostly just followed Joseph's spec. If you're curious what the code looks like for each, you can check out the repos above. I didn't really read through any of the Python code, I just wrote mine from scratch, but they're both pretty simple examples to compare and take roughly the same amount of code. Hopefully this will help inspire any CF devs who want to play with some command line tools, but haven't seen examples of how to do it in CFML. And thanks to Joseph for the original idea and letting me use it as a CFML example. https://www.codersrevolution.com/blog/a-quick-comparison-of-using-python-and-cfml-to-write-the-same-cli-tool Wed, 27 Jan 2021 07:47:00 GMT brad@ortussolutions.com (Brad Wood) CFML CommandBox https://www.codersrevolution.com/blog/a-quick-comparison-of-using-python-and-cfml-to-write-the-same-cli-tool Create your own Desktop "Toaster" Popups in CommandBox Servers Here's a quick one that I tried out for the first time today. Someone asked if it was possible for a CF app to have a desktop notification on the server it's running. CommandBox servers have a try icon that runs inside the JVM of the server that can create popups and even Swing windows. Turns out, it's actually really easy to tap into this to get a toaster popup on your desktop. Of course, this wouldn't work if you're running CommandBox as a Windows service or on a headless server like Linux with no GUI! All we need to do is tap into a staticdisplayMessage() method on an internal java class in the server. lu = createObject( 'java', 'runwar.LaunchUtil'); lu.displayMessage( 'Just letting you know', 'INFO', 'The sky is still blue', 5000 ); lu.displayMessage( 'Hold on there', 'WARNING', 'You can''t talk to me like that', 5000 ); lu.displayMessage( 'STOP...', 'ERROR', 'Collaborate and listen', 5000 ); The 5000 is the number of ms until the message hides itself. Pass a 0 if you want the popup to stay until the user explicitly closes it. Here's what it looks like. Have fun with this, but don't abuse it. This comes with no warranties and is probably only really useful for some sort of local development purposes where you want one of your local servers to be able to flag you down when it's done with a process. https://www.codersrevolution.com/blog/create-your-own-desktop-toaster-popups-in-commandbox-servers Thu, 21 Jan 2021 00:16:00 GMT brad@ortussolutions.com (Brad Wood) CFML ColdFusion CommandBox Java https://www.codersrevolution.com/blog/create-your-own-desktop-toaster-popups-in-commandbox-servers Improving Lucee's Query of Query Support One of the really great features of CFML is the ability to run SQL against a result set in memory. This allows you to union separate results together or even apply additional filtering on an exiting result if you can't control what the DB gives you. Like everything, there is a time and a place for this. There are people who strongly dislike QoQ (query of queries) but my take is that I think they're great when used with relatively small data sets and unless performance profiling shows they are causing issues, I have no problems using them. The Lucee developers however have often told people never to use QoQ because they were slow. Recently, I found out why. Ortus has a client moving a large application over from various versions of Adobe ColdFusion to Lucee Server. They have a lot of reports that grab a query from the DB and then chop it up inside loops using QoQ to crunch the data and generate the output. This has worked well for them in Adobe CF for years, but when testing their site in Lucee, the pages slowed to a crawl. And then more than one user hit their site at once, it got even worse! This client asked Ortus/me to take a look at help them figure out why Lucee was so slow. Three QoQ Implementations I received a standalone test file that ran around 1,000 QoQ's in it and I fired up the FusionReactor Profiler to see what was going on. I really can't understate the amount of time saved with FusionReactor here. I have used it in every step of this process. Lucee is also open source, so I began reviewing the implementation of QoQ to see how it worked. The first thing I found is that there are actually three implementations of QoQ in Lucee, and if one fails, Lucee just discards the error and moves on to the next one to see if it fares better. Native QoQ - In memory manipulation of the result set Another? Native QoQ - A copy of the first but... different. I'm unclear why this one exists or in what ways it's different. I think it may have been an experiment at one point that was never removed. HSQLDB via JDBC - A Java-based disk/memory DB called HyperSQL that is accessed via JDBC Poor Native QoQ Support The first two native implementations are very limited. They Don't support group by clauses Don't supporthavingclauses Don't supportaggregate functions Have bugs in their UNION functionality Have bugs in their DISTINCTfunction when used with maxrows Have bugs in their parser that throws errors on valid SQL Have slow performance. Results are sorted up to 3 times in some cases, the distinct function is slow, and the code is not tuned well HSQLDB Fallback So basically, any SQL of medium complexity past a simple select will hit the native implementation and error out. Lucee will catch this error and simply move on to the HSQLDB implementation. These errors happen every single time as well. Lucee isn't smart enough to remember when a query requires HyperSQL. Running the Select in HSQLDB looks like this: An on-the-fly JDBC datasource connection to an internal HSQLDB is created A table is created in the DB and loaded with your data The SQL is passed via JDBC to HSQLDB for execution The result is returned Since there is only a single HSQLDB created behind the scenes, this entire process is SINGLE THREADED to prevent more than one QoQ hitting it at a time. The four main issues here are: The single threaded nature just kills performance under load. Only a single QoQ can run at one time for an entire server. This is a non-starter and explains why my client's site just crumbled under the load of a few users for these QoQ-heavy pages that ran fine in Adobe CF The overhead of a JDBC call can be around 20 ms per call. This may seem small, but you run 1,000 QoQ's in a page and suddenly you have 20+ seconds of nothing but JDBC overhead. FR's query tracing can actually dog pile on this to make it even worse. The HSQLDB engine isn't great and Lucee has no direct control over it to add features and fix bugs Since Lucee discards the errors from the native implementation, there is NO WAY to know when your query has "fallen back" to the HSQLDB implementation other than looking in FusionReactor's JDBC tab to see if the QoQ's were tracked as JDBC calls. The Fix I was able to initially "tip toe" around the issues above by rewriting all their QoQ's to avoid the HSQLDB fallback and implement their aggregate functionality in CFML directly. This worked pretty well and query objects have a lot of really great member functions nowadays which I recommend looking into. However, my client's investment in QoQ involves a LOT of code and there is really no reason why they shouldn't be able to continue using the same code that's run great on Adobe CF for many years. As such, we discussed and decided to sink some effort into Fixing Lucee's native QoQ support so it could support the features they needed and at a much faster performance level without the need to rewrite all their code. This body of work has culminated in this Lucee ticket: https://luceeserver.atlassian.net/browse/LDEV-3042 And this pull request: https://github.com/lucee/Lucee/pull/1026 Update 9/21/2020 - This pull has been merged into the 5.3.8.74 snapshot! I have completely re-worked the native QoQ implementation in Lucee with the following fixes and improvements: group by clausesupport having clausesupport aggregate functionsupport Allow aggregates to reference nested operations including scalar functions ceiling( max( floor( yearsEmployed ) )+count(1) ) Faster distinct support Fix buginess in unions were results aren't distinct by default (union distinct) Improve count() to support count( all col ) count( 1 ) count( * ) count( 'literal' ) count( distinct col ) count( distinct scalarFunc( col ) ) Fix bugs in SQL parser that incorrectly requires only a single space between multi-word clauses is null is not null not in not like order by group by Remove single-threaded limitations Performance tune speed of queries I have also added support for the following system props/env vars lucee.qoq.hsqldb.disable=true – Throw exception if native QoQ logic fails lucee.qoq.hsqldb.debug=true – Log message to WEB context's datasource log any time a QoQ "falls back" to HyperSQL. This could include just bad SQL syntax. The long and short of this should be that your QoQ's run the same they always have. But now they should be much faster since they no longer fall back to HSQLDB for most operations. The one exception right now is performing joins against more than one query object. I'd love to help add support for this, but my client didn't need it and it was a significant amount of additional work. QoQ's with joins will continue to fall back to HSQLDB for now. Test Coverage Every inch of the new implementation has been commented in the code to explain what it's doing and I have added dozens of new tests to the Lucee test suite to cover all of this functionality. The new test suite can be found in my pull request above. https://github.com/lucee/Lucee/blob/9a0b0651b08460a6fdc72695a34533c91a953d6a/test/tickets/LDEV3042.cfc Performance Gains Here's the icing on the cake. I performed a battery of tests against Lucee 5.3.6 (stable) Lucee 5.3.6 (with new LDEV-3042 fixes) Adobe ColdFusion 2018 update 10 The test data was a30,000 row in-memory query object. Each test was run 100 times sync and 100 times async. Total numbers are reported in milliseconds and represent the summed execution times of all 100 runs. Async runs used a pool of 20 threads to simulate concurrent traffic. The new Lucee QoQ implementation is faster in every single test and it's faster than Adobe ColdFusion in 6 of the 8 tests. in fact, the more complex the queries get, the better the new QoQ support works. Here's a PDF of this data. https://drive.google.com/file/d/12iqZhbIh62spdJBQC2MaOroiMf8D4l7L/view?usp=sharing I played a lot with adjusting the size of the query object being selected against and the number of iterations. I found that both variables result in a linear increase in query time, thus fairly irrelevant. I settled on 30,000 rows as a nice large number that pushes the limits of the amount of data someone is likely to have in memory. The 100 iterations just just a good solid number of times to ensure the total result time was large enough to compare with accuracy. The execution of a single query is roughly 1/100th of the total time shown in each box. SELECT name, email, department FROM employees where age > 20 Lucee New Lucee Old Adobe 2018 sync 1,564 ms 2,524ms 1,101ms async 585ms 835ms 1,145ms Note, this SQL was simple enough it didn't "fall back" to HSQLBD which is why the async test is better for the old Lucee. SELECT upper(name), lower(email) FROM employees Lucee New Lucee Old Adobe 2018 sync 2,171ms 2,858ms 1,438ms async 787ms 903ms 678ms Note, this SQL was simple enough it didn't "fall back" to HSQLBD which is why the async test is better for the old Lucee. SELECT top 5 department, coalesce( null, age ) from employees Lucee New Lucee Old Adobe 2018 sync 9ms 6,344ms 853ms async 13ms 6,357ms 499ms Note, Here we see old Lucee falling back to HSQLDB finally which is why the async times aren't any better due to the singlet threading. Adobe CF does not support "top" so I used "maxrows" instead. Adobe CF also doesn't support coalesce(), so I removed it for the Adobe test. The new implementation is so much faster because this select can be optimized with an "early return" after the first 5 rows are processed since there is no grouping or ordering required. I made sure to optimize for these scenarios, when possible. SELECT department from employees WHERE department = 'Executive' ORDER BY name Lucee New Lucee Old Adobe 2018 sync 541ms 7,857ms 691ms async 235ms 7,184ms 210ms Note: Luce old falls back to HSQLDB here. SELECT count(1), avg(age) as avgAge, sum(sickDaysLeft) as totalSickDays from employees Lucee New Lucee Old Adobe 2018 sync 271ms 7,228ms 3,901ms async 134ms 7,277ms 2,438ms Note: Luce old falls back to HSQLDB here. The new Lucee implementation's fast partitioning for the aggregates really starts to shine here over Adobe. SELECT distinct department FROM employees Lucee New Lucee Old Adobe 2018 sync 1,398ms 11,173ms 5,024ms async 403ms 3,899ms 1,633ms Note: Luce old does not fall back to HSQLDB here, but is still much slower due to using a less efficient algorithm for distincting. SELECT department, max(age) as maxAge from employees GROUP BY department HAVING max(age) > 30 ORDER BY maxAge Lucee New Lucee Old Adobe 2018 sync 2,333ms 8,538ms 10,828ms async 687ms 7,795ms 3,928ms Note: Luce old falls back to HSQLDB here. SELECT department as thing, max(age) as age FROM employees GROUP BY department union all SELECT department, min(sickDaysLeft) FROM employees GROUP BY department ORDER BY thing Lucee New Lucee Old Adobe 2018 sync 3,845ms 8,512ms 19,198ms async 1,165ms 8,351ms 8,144ms Note: Luce old falls back to HSQLDB here. Final Thoughts Hopefully this pull request is reviewed and merged into the Lucee core. These changes were sponsored by UNOSOF but they will benifit much more than my client. Lucee 5.3.7 is already in release candidate form, so I've sent the pull request to the 5.3 branch which is where the 5.3.8-snapshot currently lives. Feel free to chime in on the ticket or pull to "encourage" the Lucee team in reviewing this! If you would like to help test this, you can grab one of the following Lucee core files: https://downloads.ortussolutions.com/lucee/lucee/5.3.6.66-QoQ/5.3.6.66-QoQ.lco- For a 5.3.6 stable installation https://downloads.ortussolutions.com/lucee/lucee/5.3.6.66-QoQ/5.3.8.77-QoQ.lco- For the current bleeding edge 5.3.8-snapshot release Files last updated 10/06/2020 Drop the lco file into your server context's /deploy folder and wait up to 60 seconds, or drop it in your /patches folder and restart. If you're using CommandBox, click your tray icon and select "Open..." > "Server Home" to find these folders. They are under WEB-INF/lucee-server/ Test with one or both of the ENV args defined above enabled so you can see the actual errors if the native QoQ fails. Otherwise the HSQLDB "fall back" will cover up the real errors. In CommandBox, you can enable the new flags like so: server set JVM.Args='-Dlucee.qoq.hsqldb.disable=true' or... server set JVM.Args='-Dlucee.qoq.hsqldb.debug=true' Please report any issues to me with a test case and the full stack trace. Also, big thanks to Scott Steinbeck and Zac Spitzer for helping me test this so far. https://www.codersrevolution.com/blog/improving-lucees-query-of-query-support Sat, 12 Sep 2020 01:29:00 GMT brad@ortussolutions.com (Brad Wood) CommandBox Java Lucee Performance SQL https://www.codersrevolution.com/blog/improving-lucees-query-of-query-support Who's had more vulns, take 3: Java, ColdFusion, ROR, .NET It's been 5 years since I wrote my first post comparing the number of CVEs (security vulnerabilities) reported each year by major programming languages and web frameworks and asking Who's Had More Vulns? Who's Had More Vulns- PHP, Java, or ColdFusion? And it's been 2 years since I went back and updated the numbers again to reflect the newest data, still showing there is no statistical support for the notion that CFML is somehow inherently "less secure" than other languages. Who's Had More Vulns Redux- PHP, Java, ColdFusion, ROR, or .NET? The Rub There's a fair amount of disingenuity, or perhaps just willful ignorance to the statistics here and I talk to a lot of people who are astonished that CFML is still in use due to the alleged massive numbers of vulnerabilities. One would think simply touching the code might give you rabies. Some CFers suggested that it's a form of "virtue signalling" by infosec professionals to throw CF under the bus, winking at each other over there shared distaste for a platform they have little knowledge of but assume sucks. I was brought back to this blog "series" this week after coming across this tweet by someone claiming all use of CF should be dropped due to "the number of critical CVEs it generates for very little return" and not being a "real framework" (He didn't explain what that last part meant) So I just got my weekly CERT update.... How the f*** is ColdFusion still around?! Okay, TBH, ANYTHING Adobe, but ColdFusion? It has sucked donkeys since about 2000. Seriously, folks, use a real framework. Sheesh. — Johnnie #CrossIsHere (@beerbikesbacon) October 7, 2019 Who's had more vulns? On that note, I decided to dust off my links towww.cvedetails.comand get the latest data regarding just how many CVEs are being reported by the major web technologies each year over the last 10 years. Is CF really the worst there is? Click to enlarge The Facts No, it isn't. In fact, ColdFusion is decidedly average when it comes to the raw numbers of CVEs that come out every year. Here you can see the last 10 years of CVE reports as publicly viewable onwww.cvedetails.comshowing the trend of yearly vuln reports for ColdFusion, PHP, Apache Tomcat, Java, ROR, .NET, Python, and Node.js. Python and Node are new additions. They have fewer vulns than CF, but I wanted to hit all the major server-side technologies from W3Techs to get a good data set. Java and PHP tip the scales as usual with a huge amount of reported vulns every year. This however, is never held against them, at least not in the same manner that people accuse CF. The rest of the pack is honestly pretty tight. All the major web technologies have a few vulns every year, some more than others, but there are really no stand outs other than Java and PHP. CF is comfortably "in the pack" so to speak. Mind you, I'd love to see CF have zero vulns every year, but it's just silly to pretend it's outpacing others in this regard. CVE Ratings Another thing that's bothered me a bit is the severity assigned to CVEs. Adobe's security team assigns the severity to each CVE and they appear to just have a chart where vulns of a certain type receive a certain severity rating regardless of how exploitable the vuln is. While erroring on the side of caution may seem safe to them, I think it causes unnecessarily bad press. For instance, the entire series of RMI Java serialization bugs are only exploitable on servers with their RMI port exposed, which is a very small number of installations. A simple firewall rule to only allow HTTP traffic squashes external access to this exploit. In my opinion, that should reduce the severity rating. Something should be 10 out of 10 if, say, 75% or more of all installations are sitting ducks to the attack. Then there are CVE's like this onewhich I don't think even deserves to be called a vuln, let alone a "critical" 10 rating. That was, at best, an enhancement to the long standing behavior of uploading files to help protect unscrupulous developers who were storing uploads right into their web root. Yet, Adobe security classified this enhancement as a CVE with the highest rating. I am very thankful for Adobe's quick attention and seriousness that they treat all security issues, no matter how small, but I feel like they go a little overboard sometimes and security flaws are sometimes the only marketing Adobe CF gets in the greater programming space! I'm not sure if Adobe is using it, but the CVSS is a standard for rating vulns thattakes into account (among other things) Access Vector (what sort of access to the target machine is required to exploit) Attack Complexity (How difficult is it to exploit) Exploitability (whether exploits have been publicly released) https://en.wikipedia.org/wiki/Common_Vulnerability_Scoring_System 10 Year Totals Ok, time for another graph. Click to enlarge Just reaffirming where CF sits in the standings. Nestled between .NET/Tomcat and ROR with an average of only 8 CVEs a year, compared to 64 a year for Java and 35 a year for PHP. (Note 2019 data is incomplete) Takeaways Ok, so some takeaways ColdFusion (CFML) is not inherently less secure than other web tech stacks ColdFusion (CFML) does not have a significantly larger amount of reported vulns. In fact, the opposite is true. Adobe tends to have a lot of vulns marked 10/10 Critical, but I think that's a little skewed due to their "abundance of caution" approach. It's interesting to note that a lot of the CVEs this last year were in 3rd party Java libs and required direct access to the ports exposed by CF that a normal firewall would block Adobe did a great job in documenting and fixing their vulns quickly. This is what you expect from a well-supported software. Raw Data And here's my raw data for the graphs above: Java PHP .NET Tomcat CF ROR Node.js Python 2009 44 22 10 8 5 4 0 0 2010 54 35 4 8 6 1 0 7 2011 57 37 6 14 14 12 0 2 2012 47 22 13 15 5 10 0 5 2013 131 13 17 4 13 18 0 2 2014 115 33 10 13 5 12 1 6 2015 80 28 20 3 4 2 1 1 2016 37 106 9 12 5 8 9 5 2017 69 44 4 22 6 4 19 3 2018 56 21 12 8 14 0 18 8 2019 17 25 10 4 9 0 2 9 Average 64 35 10 10 8 6 5 4 Total 707 386 115 111 86 71 50 48 Sources https://www.cvedetails.com/product/1526/SUN-JRE.html?vendor_id=5 https://www.cvedetails.com/product/19117/Oracle-JRE.html?vendor_id=93 https://www.cvedetails.com/product/2002/Microsoft-.net-Framework.html?vendor_id=26 https://www.cvedetails.com/product/887/Apache-Tomcat.html?vendor_id=45 https://www.cvedetails.com/product/8739/Adobe-Coldfusion.html?vendor_id=53 https://www.cvedetails.com/product/22568/Rubyonrails-Ruby-On-Rails.html?vendor_id=12043 https://www.cvedetails.com/product/30764/Nodejs-Node.js.html?vendor_id=12113 https://www.cvedetails.com/product/18230/Python-Python.html?vendor_id=10210 https://www.codersrevolution.com/blog/whos-had-more-vulns-take-3-java-coldfusion-ror-net Wed, 09 Oct 2019 01:36:00 GMT brad@ortussolutions.com (Brad Wood) https://www.codersrevolution.com/blog/whos-had-more-vulns-take-3-java-coldfusion-ror-net A Quick Example Of Functional Programming (FP) In CFML I was adding a feature to CommandBox CLI this week when I typed up some code that iterated over the keys in a struct, filtering out the ones it needed and then performed an action on the matching ones. I used the functional methods structFilter() and structEach() in CFML. They are an example of functional programming as they accept functions as input. This also means we can call them "higher order" functions. But termininologo aside, I typed up the same code using an older interactive approach just to compare the two. It was an interesting and rather self contained example so I thought I'd share it as a real life use case for FP (functional programming). This requirement for this code was simply to take a struct of incoming arguments, filter out any called "scriptName" and create a CLI environment variable for the remaining key/value pairs. The "old" way Here is the way I'd write this code previously in CFML: for( var key in arguments ) { if( key != 'scriptName' ) { systemSettings.setSystemSetting( key, arguments[ key ] ); } } Let's take some notes on this: This style tells the compiler HOW to accomplish the task (loop, set variables, check conditions, etc) This style requires me to handle the looping mechanics This style requires the use of temporary variables to track the loop (which live past the scope of this code) This particular example doesn't modify the arguments struct, but it also doesn't promise not to The "functional" way Here's the way I actually wrote the code this week: arguments .filter( ( k, v ) => k != 'scriptName' ) .each( ( k, v ) => systemSettings.setSystemSetting( k, v ) ); Let's take some notes on this as well: The style tells the compiler WHAT to do, not how to do it (filter the struct, then run something for each item, etc) This style uses closures to encapsulate the action to be taken against each struct element This style has no manual looping This style leaves no local variables littered in memory to track the loops (k, and v are in the arguments scope of each closure) This style, by definition, will leave the arguments struct unchanged as the structFilter() method will return a new struct I also didn't need to dereference the value in the struct since that was conveniently passed to the closure as "v". Another functional way Ok, so both examples are small and easy to read. You may be thrown off a little by the "arrow function" syntax I used in the second example. Adobe ColdFusion doesn't support this still, even though they marked the ticket for it "fixed" before the pre-release. I honestly have no idea what's going on there. Lucee has supported this for quite some time and it's really just a way to strip out as much boilerplate as possible from a closure. It's a little wonky at first for people who are used to all the extra curlies and keywords, but the more you see it, the easier it is to read. Basically, the "function" keyword is missing as is the return statement in the filter (the last expression is automatically returned) as well as the curlies for the function body since both closures only contain a single statement. If that second example is a little hard on your eyes, here's a more familiar version of the exact same code that will also work in Adobe ColdFusion: arguments .filter( function( k, v ) { return k != 'scriptName'; } ) .each( function( k, v ) { systemSettings.setSystemSetting( k, v ); } ); Conclusion So I think all the points I really wanted to make are in the bullets above. Both versions are equally valid and produce the same behavior, but once you get used to telling the compiler WHAT to do and not HOW to do it, it really starts to clean up your code. Gone are "off-by-one" looping errors, nasty side effects on the original data structures (immutability), and reading the code has a more linear flow to me since I don't need to unwrap the loops and if statements in my mind to figure out what it's doing. Extra Credit If you're exceptionally clever, you may have noticed that the each() function passes the same arguments the closure receives directly to the setSystemSetting() function and you may have thought, WAIT A MINUTE-- can't that be simplified even more down to just this: arguments .filter( ( k, v ) => k != 'scriptName' ) .each( systemSettings.setSystemSetting ); In that example we forego the closure entirely and just pass the systemSetting UDF in directly as a reference (note there's no () since I'm not executing it, just passing a reference to it). This WOULD work, however in our case, ColdFusion/Lucee passes a 3rd positional argument which is a reference to the original struct AND setSystemSettings() also accepts a 3rd, optional argument which gets screwed up when it receives in the struct reference instead. So, if you thought that, then good thinking, but in this specific case I couldn't get away with it :) https://www.codersrevolution.com/blog/a-quick-example-of-functional-programming-fp-in-cfml Sun, 30 Jun 2019 01:23:00 GMT brad@ortussolutions.com (Brad Wood) ColdFusion CommandBox Lucee https://www.codersrevolution.com/blog/a-quick-example-of-functional-programming-fp-in-cfml Connect To SQLite DB using CFML via CommandBox Task Runners Here's a quick trick on working with a SQLite DB from CFML quicky and easily. I was playing with the SQLite DB that the original .NET version of GitHub for Desktop and I wanted to access the db file from the CLI to query data and manipulate it. The steps where very easy. Create a Blank Task Runner The very first step was the easiest, and this was to create a blank CommandBox Task Runner: task create --open Download the SQLite JDBC jar I grabbed the latest JDBC driver for SQLite from this URLand I placedsqlite-jdbc-3.23.1.jar in a /lib folder. This line of code at the top of my task runner will classload the jar from the lib folder relative to the working directory of my task.. classLoad( filesystemUtil.resolvepath( 'lib' ) ); Declare the Connection Details Next I created a struct that represents the connection details for the SQLite connection.Thanks to an old post of Ray Camden's to get the correct class name and JDBC URL. var myDSN = { class: 'org.sqlite.JDBC', connectionString: 'jdbc:sqlite:#filesystemUtil.resolvepath( '~/AppData/Local/GitHub/cache.db' )#' }; Note that tilde (~) works in CommandBox even on Windows to reference your user home dir. Run The SQL And finally I ran my query against the SQLite DB using a normal CFQuery tag. You can use queryExecute() if you like. I found this version easier to read since there are no parameters. query name='local.qry' datasource=myDSN { echo( " SELECT key, typename FROM main.CacheElement WHERE key = 'tracked-repositories' " ) } Notice, I'm using Lucee's nice feature of supplying a struct for the datasource details instead of a string.This prevents me from needing to create the datasource. More info here. Finished Product Here is the entire Task Runner in one piece including a line to output the query result to the console. component { function run() { // https://github.com/xerial/sqlite-jdbc/releases classLoad( filesystemUtil.resolvepath( 'lib' ) ); var myDSN = { class: 'org.sqlite.JDBC', connectionString: 'jdbc:sqlite:#filesystemUtil.resolvepath( '~/AppData/Local/GitHub/cache.db' )#' }; query name='local.qry' datasource=myDSN { echo( " SELECT key, typename FROM main.CacheElement WHERE key = 'tracked-repositories' " ) } print.line( formatterUtil.formatJSON( qry ) ); } } I hope this sparks some ideas in your head. Play around and see just how powerful CFML can be from the CLI, especially when we have all of the Java libraries out there at our disposal. https://www.codersrevolution.com/blog/connect-to-sqlite-db-using-cfml-via-commandbox-task-runners Thu, 27 Sep 2018 05:10:00 GMT brad@ortussolutions.com (Brad Wood) CommandBox https://www.codersrevolution.com/blog/connect-to-sqlite-db-using-cfml-via-commandbox-task-runners