2 november 2022 How to do an incremental BUFFER LOAD in Qlik Sense Deel dit bericht Every Friday at Bitmetric we’re posting a new Qlik certification practice question to our LinkedIn company page. Last Friday we asked the following Qlik Data Architect certification practice question about incremental loads using BUFFER (incremental) LOAD: It was interesting to see that we received quite a mixed bag of responses. Some people went for the ‘logical’ answer, others for the ‘sales’ answer, but of course we were looking for the ‘certification’ answer 😉 The correct answer is B: Use the BUFFER (incremental) prefix Whereas Answer A is obviously nonsense we also saw quite some comments that answered C or D. To be fair, in our consulting practice we will most likely also opt for answer D. Depending on the scale and future requirements we might even opt for answer C. From the Qlik certification perspective though, there is only one right answer: B. Let’s see how it works. Buffer load statement The BUFFER load statement is used in front of a LOAD or SELECT statement and tells Qlik to automatically create a QVD file of the result and use this to cache/buffer subsequent loads. The statement is as follows: Buffer [(option [ , option])] ( loadstatement | selectstatement )option::= incremental | stale [after] amount [(days | hours)] After using the buffer statement a QVD buffer file is made and stored in the buffer subfolder. This is located in the Qlik program data folders. Buffer QVD files are removed when the app that created them no longer exists or no longer references the created buffer. The same thing happens when you change the fields you load. If you make any change to the load or select statement the buffer will be deleted. This happens because the buffer filename is calculated by running a 160-bit hexadecimal hash function on the entire select and load statement. When to use? The buffer load is valuable in it’s ability to reduce load times by buffering the loaded table into a QVD. As exampled in this question, this is best used with large files like logfiles. Logfiles have the advantage that they are incremental in their nature by default. All new created logs are appended to previous loaded logs and no changes happen in the old data. How to use? In the following examples we are loading an almost 40 million row logfile in various ways. Let’s see what the difference between the different loads is, how buffer exactly works and what the different advantages and disadvantages are. Incremental load: First of we do an incremental load of the logfile: This results in the following load time: As we can see the total load takes one and a half minutes. Now let’s see what the buffer load does: Buffer incremental load: This results in the following load time: The buffer load only takes 44 seconds. But wait, there is more to get here. Since buffer, as we have learned, stores a buffer QVD by itself, we don’t need to store the log file. In that case we can even win some more time: As we can see this brings the buffer reloading time back to only 1/3 of the incremental load. Why does this happen? The answer to why the buffer load in this case is faster lies in the way the incremental load of the buffer works. The first incremental load works on a where exists clause. This means that it needs to fully read through both the stored log.qvd and the logfile to check on the logid field and see if they already exist or not. Buffer works in a different way. The buffered QVD has a header with metadata: In this metadata we can find the sourcefilesize field. This is the size the previous read version of the sourcefile (the logfile). When using the buffer incremental load syntax, Qlik simply skips to that position in the input file and starts loading data from that point onward. This is the reason why this works well with logfiles, since they don’t (or shouldn’t) change after a line has been written. In case you have files where historical data can still change, this is not a right approach as we can see when we list all the advantages and disadvantages of buffer incremental load versus where exists incremental load. Advantages and disadvantages of using BUFFER (incremental) Advantages Easy to script and understand Faster load times compared to where exists Good solution for loading large logfiles Disadvantages Changes in the load/select statement structure delete the buffer Less control over the resulting file (stored in the Qlik program data folder) Does not work when historical data can change That’s it for this week. See you next Friday? More from the Bitmetric team Take your Qlik skills to the next level! Since 2013, the Masters Summit for Qlik is the premier advanced training for Qlik. Join us in Vienna and take your Qlik skills to the next level. Join the team! Do you want to work within a highly-skilled, informal team where craftsmanship, ingenuity, knowledge sharing and personal development are valued and encouraged? Check out our job openings. Friday Qlik Test Prep Performance Script Solution Hoe kunnen we je ondersteunen? Barry beschikt over meer dan 20 jaar ervaring als architect, developer, trainer en auteur op het gebied van Data & Analytics. Hij is bereid om je te helpen met al je vragen. Bel ons Mail ons 8 oktober 2024 Artificial Intelligence, Machine Learning, and Deep Learning Explained: How They Impact Your Business In today’s rapidly evolving technological landscape, Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are transforming industries and redefining how businesses operate. In this blog post, we will break down these three definitions and elaborate on them. AI 25 september 2024 Building Ethical AI: Practical Frameworks for Responsible Innovation AI is transforming industries with innovation and efficiency. But with great power comes great responsibility. The real question is: How do you turn ethical principles into actionable guidelines for AI development? And what steps should your team take to make it happen? AI 17 september 2024 What is AI Ready Data Data quality is all about how accurate, consistent, complete, and up-to-date your data is. If your data is good, you’ll get reliable insights and be able to make smarter decisions. It’s a key part of making sure your AI and machine learning projects are successful. AI Qlik
8 oktober 2024 Artificial Intelligence, Machine Learning, and Deep Learning Explained: How They Impact Your Business In today’s rapidly evolving technological landscape, Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are transforming industries and redefining how businesses operate. In this blog post, we will break down these three definitions and elaborate on them. AI
25 september 2024 Building Ethical AI: Practical Frameworks for Responsible Innovation AI is transforming industries with innovation and efficiency. But with great power comes great responsibility. The real question is: How do you turn ethical principles into actionable guidelines for AI development? And what steps should your team take to make it happen? AI
17 september 2024 What is AI Ready Data Data quality is all about how accurate, consistent, complete, and up-to-date your data is. If your data is good, you’ll get reliable insights and be able to make smarter decisions. It’s a key part of making sure your AI and machine learning projects are successful. AI Qlik