25 January 2024

Qlik Optimized Load and Concatenation

At Bitmetric we continuously hone our skills, and we like to help you do the same. That’s why we regularly post a new Qlik certification practice question to our LinkedIn company page. Last Friday we asked the following Qlik Data Architect certification practice question about optimized load and concatenation in Qlik Sense. Do you know the answer?

Many (confidently) chose answer C. However, the correct answer was actually B.

The correct answer is B

Both tables in figure 2 load optimized, but only the first table in figure 1 loads optimized. See the logs from both load statements below.

Figure 1
Figure 2

As expected, the fully optimized load is significantly faster in relative terms, though in absolute terms the difference is negligible.

Maintaining optimized load in Qlik Sense

We covered the basics of optimized load in our very first Friday Qlik Test Prep question on Qlik optimized load. To summarize, a load statement is optimized when you limit your operations to:

- Renaming fields (using an alias). You can also load the same field twice under a different alias. This can be useful to create a separate key field.
- Omitting fields by not including them in the LOAD statement.
- Using a single WHERE EXISTS, with a single parameter. So WHERE EXISTS([Country]) is OK, WHERE EXISTS([Country], [ISO Country Code]) is not.
- JOIN, KEEP or CONCATENATE with another table.

LOAD DISTINCT will also keep a load optimized. The DISTINCT part will be processed after the LOAD, however, so you might still want to think twice before applying it to very large QVDs.

Neither LOAD statement breaks any of these rules, so there must be some additional factors to take into consideration. Let’s see what they are.
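Before moving on to those factors, here is a rough sketch of what the permitted operations listed above might look like together in one load script. The table names, field names, the [lib://Data/...] library connection and the Customers.qvd file (assumed to contain CustomerID, CustomerName and Country) are all hypothetical and only serve as illustration.

```qlik
// Hypothetical names throughout; Customers.qvd is assumed to exist.

// Seed the values that WHERE EXISTS will test against.
Countries:
LOAD * INLINE [
Country
Netherlands
Belgium
];

Customers:
LOAD
    CustomerID,
    CustomerID   AS [%CustomerKey],  // same field loaded twice under another alias
    CustomerName AS Customer,        // rename via an alias
    Country                          // fields not listed here are simply omitted
FROM [lib://Data/Customers.qvd] (qvd)
WHERE EXISTS([Country]);             // one WHERE EXISTS with a single parameter
```

Going by the list above, anything beyond these operations, for example wrapping a field in a function, would fall outside it. The script log remains the place to confirm whether a particular load actually ran optimized.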
Maintaining optimized load when concatenating

When concatenating tables, an additional rule applies: subsequently loaded tables must contain all the columns from the tables that were loaded before. However, previously loaded tables do not need to contain all columns from tables that are loaded later. Visually:

Optimized load maintained
Optimized load broken

So, to maintain an optimized load while concatenating tables, all we have to do is load and concatenate the tables in ascending order of width. Sounds simple, right? Unfortunately, it can get a bit more complex than that. Tables might not always share the same columns. A column that appears in one table might be missing from another table, and vice versa.

How to deal with missing columns

Consider the following scenario. We first load Table 1 with three columns, to which we concatenate Table 2. As the second table is missing some columns from the first table (B and C), the resulting load will be unoptimized. To fix this, we need to add the missing columns to the second table. Theoretically, this ensures the load remains optimized. There’s a practical issue though: if we create the missing fields in the script with a null() function or an empty string, optimized load is broken. To solve this issue, we need to go upstream, to the moment when the QVD files are created.

Adding dummy columns to QVDs

A common approach is to add null values with the proper field name for each ‘overlapping’ column in the QVD. This solves the issue, but isn’t optimal. It can be hard to predict which QVDs will need to be combined in the future. Different QVDs may be combined in different ways in different apps, which means that we’ll potentially need to create a lot of empty fields, cluttering files and reducing ease of maintenance.

Adding a universal null column

An alternative solution is to add a single null column to your QVDs by default (typically only the fact table QVDs).
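A minimal sketch of this idea is shown here, under stated assumptions: the table names, the inline sample data and the [lib://Data/...] library connection are hypothetical. The first part adds the empty field when the QVDs are created; the second part concatenates the QVDs in a downstream app without computing anything in the LOAD statement.

```qlik
// --- Upstream app: create the QVDs (hypothetical names and sample data) ---
Table1:
LOAD * INLINE [
A, B, C
1, 2, 3
];
STORE Table1 INTO [lib://Data/Table1.qvd] (qvd);
DROP TABLE Table1;

Table2:
LOAD
    C,
    D,
    Null() AS [Null]      // the single, always-empty universal null column
INLINE [
C, D
4, x
5, y
];
STORE Table2 INTO [lib://Data/Table2.qvd] (qvd);
DROP TABLE Table2;

// --- Downstream app: concatenate using only aliases, no expressions ---
Facts:
LOAD A, B, C
FROM [lib://Data/Table1.qvd] (qvd);

Concatenate (Facts)
LOAD
    [Null] AS A,          // stands in for the missing column A
    [Null] AS B,          // same QVD field again, under a second alias
    C,
    D
FROM [lib://Data/Table2.qvd] (qvd);
```

The Null() call only runs upstream, where an unoptimized load does not matter. Downstream, the second LOAD limits itself to renaming fields and loading one field twice, and it supplies every column already present in Facts (A, B and C).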
This column can then be flexibly aliased for each column that needs to be created. No need to go back to the upstream app creating the QVD. For example, a QVD that contains an empty field called [Null] can load that field as the (non-existing) columns [A] and [B]. The result: the optimized load is maintained during concatenation, with only a single additional column in the source QVD.

To wrap it up, by lining up your tables in the right order and adding either dummy columns or a universal null column, you’re not just keeping your load optimized; you’re playing it smart. This isn’t just about making your Qlik Sense work faster; it’s about making it work smarter. And as we’ve seen, with a bit of know-how and a few strategic moves, you can keep your data flowing smoothly, no matter how tricky the concatenation gets. So, go ahead, put these tips to the test in your next Qlik Sense project and watch the magic happen!