Recently, we upgraded Tableau from version 8.0 (32-bit) to 8.1.7 (64-bit), along with configuring Automatic Login (Single Sign-On) and PostgreSQL access to the Tableau system database. After the successful upgrade of the application and configuration of these features, I would like to share my experience, as it may be useful should you face a similar situation.
Based on Clariba’s specific expertise and successful completion of all requirements of the SAP® Recognized Expertise Program, Clariba Consulting Middle East has been granted the SAP Recognized Expertise designation in SAP HANA® and Business Intelligence.
These days almost every company stores data such as users, emails, roles, printers, etc. in what is called a Directory Service, providing centralized access to company-related data that can be used across all its systems. One of the most widely used solutions is Active Directory from Microsoft, since it is tightly integrated with Windows systems. Another is LDAP, which stands for Lightweight Directory Access Protocol; it also has a broad user base, since it is the de facto standard in Linux-based systems.
Given the fierce competition and the ever-soaring demands of customers, small and medium-sized enterprises (SMEs) today need to invest in BI capabilities to stay relevant and meet these demands. Yet, with scarce resources available, the alignment of business and IT is vital to ensure success, and new hosted BI solutions that free up cash flow are attractive options for SMEs.
SAP BusinessObjects (BO) BI 4 Platform has come with many improvements that have made the system more complex. As BI 4 administrators, we are required to know how this application works in order to apply the proper optimization improvements to the platform when required. In this blog article, we will cover the most important concepts to consider as a first step to increase the performance of the BI platform. It is very important for a good designer to understand the conceptual tiers in the BI Platform: the Application tier, the Intelligence tier and the Processing tier. We will refer to each one of them and nail down some optimization tips.
Don’t you just hate it when you see a report with DIV/0 errors? Thankfully, Web Intelligence provides us with a function to handle these types of errors, allowing us to provide a more meaningful message.
Let's look at IsError:
- IsError() is a function that returns a boolean value.
- A 1 is returned if the formula being evaluated is in error; a 0 if there is no error.
- This function is particularly useful when embedded in an IF statement.
Let's look into this with an example:
1. I’m going to create a variable in a WebI document called formula.
The value for this variable is =6/0.
If we use this variable in the report, we can see something like the picture below:
2. Now we can use the function mentioned above. I create a new variable called IsError to manage this error.
As you can’t divide by zero, a DIV/0 error has occurred and a 1 is returned by the IsError statement.
The IF statement then presents one of two outcomes depending on this boolean value.
In the case of an error, the user is presented with some meaningful text:
“The denominator object is 0”
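Based on the description above, the IsError variable can be defined with a formula along these lines (a sketch, where [formula] refers to the variable created earlier):

```
=If(IsError([formula]); "The denominator object is 0"; [formula])
```

The IsError check returns 1 when [formula] is in error, so the If statement displays the explanatory text instead of the raw DIV/0 error.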
So if we use this new variable in the report, we can see something like the picture below:
As you can see, using the new variable IsError we no longer see the error message that we got with the variable formula.
3. Now, I’m going to change the value for the variable called formula as below:
The result that we get is the same for both variables:
In conclusion, it is good practice to include error handling in your reports, particularly when you are performing any type of division.
For the user, it is much better to show an explanation of the error than only the message that WebI provides.
Wrapping an If IsError check around your formula ensures that these types of errors are handled in the correct manner and should help the user identify how the problem can be fixed.
You can manage other kinds of errors using the same process.
When first using Microsoft SQL Server Integration Services (SSIS), the developer’s impression is that the tool provides powerful features and is simple to use. Although this is true, there are some common errors that can waste our time. The purpose of this blog article is to cover some of these very common issues and save you some time when starting to develop a solution with Microsoft SSIS.
As a quick overview, SSIS is a component of Microsoft SQL Server that we can use to perform data integration tasks. Microsoft provides a visual tool for the ETL Development based on Visual Studio called SQL Server Business Intelligence Development Studio.
- Solution Explorer: Contains all the project files, mainly packages, folders and connection managers. The connection managers are configuration files providing connectivity to sources (flat files, databases…). There are two types, depending on ownership: package and project connection managers.
- Properties tab: Displays properties of the selected item.
- Connection managers: Here we have visibility for all the connection managers used by the package (project and package connection managers).
When taking the first steps with the tool, there are some common errors that can make us waste a lot of time and that can become a challenge once we deploy the solution in a server. We can summarize these problems into the following points:
- Configuration of 32/64 environment
- Protection Levels
- Network drives
Configuring for 32/64 environment:
The machine may not have a 64-bit data provider for Microsoft Excel, Oracle or ODBC. In order to configure the project to be executed in 32-bit mode, we have to go through the project properties. In the Solution Explorer view, click Properties -> Debugging, where we can switch off the 64-bit mode:
The next step after development is done is probably to deploy the ETL to a server. The project is deployed to an Integration Services Catalog in an instance of SQL Server. The SQL Agent will then be responsible for executing and scheduling the packages. This whole process is described in detail in the following link:
Once deployed on the server, we can also check whether the job is going to be executed in 32- or 64-bit mode (in the rare case the server is using a 32-bit runtime). We can switch between 32- and 64-bit mode via job -> Properties -> Steps -> step properties -> 32-bit runtime.
An SSIS project contains confidential data for many purposes. Most probably it contains connection manager files in order to be able to connect to different data sources. Within these files, SSIS stores the credentials for the data sources. These credentials can be configured to be encrypted, depending on the protection level. SSIS provides the following options for the protection level:
- DontSaveSensitive: Does not store any sensitive info
- EncryptAllWithPassword / EncryptSensitiveWithPassword
- EncryptAllWithUserKey / EncryptSensitiveWithUserKey
- ServerStorage
More details about the protection levels can be found here: http://blogs.prodata.ie/post/SSIS-Package-Protection-Levels-and-their-Usage.aspx
Usually there are not many problems when developing on your machine. Many problems come up once the package is deployed and the SQL Agent tries to read data through the connection managers; if you debug this process you will get a connection attempt error. Switching between these options can help when troubleshooting.
However, the most common problems come from one of these reasons:
- Protection level is different at project level and package level.
- Protection level is configured as encrypt with user key and SQL Agent user is different from the project owner.
We can change the package protection level by going to the Control Flow and looking in the Properties tab (bottom-right corner in the image below).
The protection level at project level can be found by right-clicking the project -> Common Properties -> Security -> Protection Level.
Following the same procedure, we can see the username of whoever created the package/project. If encryption mode is enabled (by default it is), then we need to make sure that the project/package ownership matches the username running the SQL Server Agent service. We can check this via Control Panel -> System and Security -> Administrative Tools -> Services, and then Properties of the SQL Agent service -> Log On.
A common issue when loading data from flat files appears when these files are located on a shared drive. SQL Agent will not recognize network paths if they are not in UNC format (read more about UNC at http://compnetworking.about.com/od/windowsnetworking/g/unc-name.htm ), whereas SSIS works fine with them. This can be confusing, since when the developer deploys the package on the server everything seems to be fine. But when the job executes, the SQL Agent will report a “Path not found” error.
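As an illustration (the server and share names here are hypothetical), the same file referenced through a mapped drive and through its UNC equivalent would look like this:

```
Z:\Web-portal data\report.csv                      (mapped drive: not resolved by SQL Agent)
\\fileserver01\public$\Web-portal data\report.csv  (UNC path: resolved correctly)
```

The mapped drive letter only exists in the session of the user who mapped it, which is why the SQL Agent service account cannot resolve it.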
Since we cannot specify the UNC path through the SSIS GUI, we need to write it within the connection manager XML file, manually introducing the UNC path in the DTS connection string:
DTS:ObjectName="Web Portal SOURCE"
DTS:ConnectionString="\\dc01fpnv04\public$\My Number My Identity\Exclusions\Web-portal data\Web portal weekly Submission Report new1.csv" />
We can access this file by right click in the connection manager -> View Code:
BI and data analytics are hitting the top trend and prediction lists in 2014 and beyond. We have highlighted our top 4 analytics trends from these lists.
Many online publications and analysts have recently published their top technology trends and predictions for 2014, and business intelligence, big data and data analytics are topping the lists. They are singing the same tune as us! I recently wrote an article called Analytics Like Never Before, in which I shared my insights on the trends of data analytics in the Gulf region and predicted that the revolutionary wave currently happening in the BI and analytics market will not stop. The winners shall be those organisations that seize the opportunity to re-imagine their business to drive competitiveness and profitability.
1. The business takes ownership of process and intelligence.
Forbes.com summarized Forrester’s list: Technology Trends To Watch: 2014 To 2016 and the number 4 trend is The business takes ownership of process and intelligence. In the list, Forrester highlights that IT is losing its control over business intelligence platforms, tools, and applications often due to IT’s inability to operate at the increased pace of the business.
2. Businesses get proactive by leveraging customer data.
CIO.com lists 12 Big Data Predictions for 2014 and looks at 12 ways the technology and the market will evolve in the coming year and we have highlighted number 4: Businesses get proactive by leveraging customer data. Cio.com states, “Companies are going to get serious about leveraging the digital breadcrumb trail customers leave as they interact with products and services online, according to Gainsight. To get there, data analytics have to drive beyond the BI team to provide business value throughout the company. ” We already see companies such as Spinney’s, a large supermarket retail chain in Dubai, unleashing the power of their customer data. Read more about Spinney’s.
3. Big data and analytics get real.
Baselinemag.com published their perspective on the 6 Top Tech Trends to Watch in 2014. They stated, “the (data analytics) technology is now advancing at a rapid pace and, thanks in part to clouds, better ways to extract data and next-gen analytics tools, organizations are able to transform a growing mountain of data (including unstructured data) into knowledge.”
4. Only 40% of CIOs will rise to produce business-enhancing insights from big data and analytics.
International Data Corporation (IDC) hosted a webinar “IDC 2014 Predictions: CIO Agenda - Embracing 3rd Platform Leadership Challenges as IT Transitions from Technology to Service Delivery”, highlighting the top 10 market predictions for the year ahead. And big data and analytics made the list at number 2: Before 2017, only 40% of CIOs will rise to produce business-enhancing insights from big data and analytics. To read and replay the webinar, go here.
“Analytics like never before” is a theme used by a BI expert, Timo Elliott. I couldn’t agree more and we will see it take off in 2014 and beyond. Happy New Year!
Marc Haberland is the Managing Director of Clariba. Marc has more than 15 years experience in Business Intelligence (BI), analytics, strategy management and Enterprise Performance Management (EPM) across telecommunication, education, healthcare, manufacturing, banking and public sectors. Marc leads a team of 30+ BI and analytics experts who deliver innovative, reliable and high-quality business analytics solutions, providing its customers with clarity and actionable insight to improve their business performance.
As an industry thought leader, Marc shares his insights on hot topics and trends through various speaker engagements, publications, discussion groups and his company blog.
A recent article in the Harvard Business Review by Thomas Davenport, professor and fellow at MIT, caught my attention. Davenport refers to the beginnings of a new era: “After Big Data” in his article Analytics 3.0. While organizations built around massive data volumes — such as Facebook, LinkedIn, eBay, Amazon and others — may be ready to move to the next wave of analytics, the full spectrum of big data capabilities is yet to be realized in the traditional organizations we work with on a daily basis.
However, despite the ever-improving technical advances and the continuous innovation of our customers, I am finding a common theme that also Davenport refers to in his article: “Competing on analytics not only in the traditional sense (by improving internal business decisions) but also by creating more valuable products and services” requires the resolve of the companies’ management, cross-disciplinary teams and more change management than ever. He mentions UPS’ Orion system for the optimization of routing for parcel delivery, in which more time was spent on change management than the actual technical implementation.
As analytics becomes more embedded in companies’ business operations and decision-making becomes more agile and information-driven, it is critical that organizations not only implement the technology but, more importantly, organize for successful analytics and ensure everyone from executives to employees is prepared for the inevitable change.
Over the past two years, I have spent considerable time working with companies to break down the traditional barriers between IT and business by supporting the design and implementation of the most optimal, yet dynamic, structure to allow the organization to adapt to this new reality. As an example, the results of this transformation in several operating companies of the world’s largest telecommunications group have been tremendous. With continued pressure on headcount and operating expenses, these telecommunications companies were able to save more than 30% on OPEX, while increasing the satisfaction of business users with improved agility, governance and insight through self-service analytics.
The implementation of a new analytics-focused, multidisciplinary organizational structure, an “Analytics Center of Excellence”, reporting directly to the most suitable C-level executive (i.e. CFO / CSO / CEO), ensures the necessary executive sponsorship and provides the platform to prepare the organization for the next level of analytics. Only a focused approach, one that treats information and analytics as a strategic asset of the company, aligns IT and business functions through change management, and embeds self-service BI in the day-to-day activities of line managers and employees alike, will allow the organization to reap the benefits of the technical capabilities available today and into tomorrow. Perhaps it is time for your organization to review how it is approaching analytics?
Marc Haberland is the Managing Director of Clariba. He looks back on 14 years of Business Intelligence (BI) and Enterprise Performance Management (EPM) experience across many industries throughout Europe, the Middle East and North America. Marc currently leads his team of BI experts in Qatar, the UAE, Oman and Spain to deliver projects across the Gulf, KSA, in Europe and North America. Prior to working in the BI space Marc held various management positions in sales, marketing and product management with Picis and Hitachi Europe. Marc received his Masters in Technical Marketing and his Bachelor in Electronics Engineering from Polytechnical University in Munich, Germany. Marc can be contacted at firstname.lastname@example.org
Are you satisfied with the way you currently manage the dependencies in your ETL? Dependencies between jobs (or parts of jobs) are an important aspect of the ETL management. It pertains to questions like: Do you want to execute job B if job A failed? Imagine that you have a job C with sub-job 1 (usual runtime: 3 hours) and sub-job 2 (usual runtime: 2 minutes). If sub-job 1 was successful and sub-job 2 failed, can you gracefully restart job C without the sub-job 1 being restarted again?
As soon as you have more than one simple job, you have to manage your dependencies. In this article (part 1 of a series about ETL Dependencies Management) I’ll first list some of the characteristics I’m looking for in an ideal dependency management system. I will then have a look at some of the possibilities offered by SAP Data Services 4. In part 2 (my next post), I will propose the architecture of a possible dependency management system. In part 3, I will go into the details of the implementation in Data Services. I’ll finish with part 4 by telling you how the implementation went, and whether some improvements are possible.
The ideal dependency management system
In this post I will use the word “process” to designate a series of ETL operations that have a meaning together. Example: extracting a source table, creating a dimension, or updating a fact table. The objective here is to manage the dependencies between the processes: updating a fact table should probably only be allowed if updating the corresponding dimensions was successful.
A dependency management system should ideally have at least the following characteristics:
- Run a process only if its prerequisites ran correctly
- After a failure, offer the option to re-run all the processes or only the processes which failed
- Trace the outcome of each process (ran successfully, failed, did not run)
- Run dependent processes dynamically (rather than statically, i.e. based on date/time)
Let’s enumerate some of the possibilities offered by Data Services, with their respective pros and cons.
1) One job with all processes inside. This is very easy to implement, dynamic in terms of run times, but it doesn’t allow for concurrent runs. Most importantly, it means that failures have to be managed so that the failure of one process does not stop the whole job.
2) One process per job, with jobs scheduled at specific times. This is very easy to implement, allows concurrent runs, but is not dynamic enough. If the process durations increase with the months/years, jobs may overlap.
3) One main job calling other jobs (for example with execution commands or Web Services).
4) One process per job, all the jobs being scheduled at specific times, but checking in a control table if the pre-requisites ran fine. Otherwise they just sleep for some time before checking again.
5) Use the BOE Scheduler to manage jobs based on events (how-to is well described on the SCN). I’ve not tested it yet, but I like this approach.
By default, the first two possibilities only manage the “flow” side of the dependency management (after A, do B). But they do not manage the conditional side of the dependency management (do B only if A was successful). In both cases, a control table updated by SQL scripts would allow the ETL to check if the prerequisite processes have been run correctly.
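As a sketch of this control-table idea (the table and column names are my own invention, and I use Python with an in-memory SQLite database as a stand-in for whatever database would actually host the ETL control table; in Data Services this would be plain SQL in script objects):

```python
import sqlite3

# In-memory stand-in for the ETL control table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE etl_control (
        process_name TEXT PRIMARY KEY,
        status       TEXT,   -- 'SUCCESS', 'FAILED', 'DID_NOT_RUN'
        run_end      TEXT
    )
""")

def record_outcome(process, status):
    # Each process writes its outcome here at the end of its run.
    conn.execute(
        "INSERT OR REPLACE INTO etl_control VALUES (?, ?, datetime('now'))",
        (process, status),
    )

def prerequisites_ok(prerequisites):
    # A process may only start if every prerequisite ran successfully.
    for p in prerequisites:
        row = conn.execute(
            "SELECT status FROM etl_control WHERE process_name = ?", (p,)
        ).fetchone()
        if row is None or row[0] != "SUCCESS":
            return False
    return True

# Example: the fact table load depends on its dimensions.
record_outcome("dim_customer", "SUCCESS")
record_outcome("dim_product", "FAILED")
print(prerequisites_ok(["dim_customer"]))                 # fact load may proceed
print(prerequisites_ok(["dim_customer", "dim_product"]))  # fact load must wait
```

This check is also what makes the optimal restart discussed below possible: after a failed run, only the processes whose status is not SUCCESS need to be re-run.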
What I don’t really like about solutions 2 to 5 is that it’s difficult to get an overview of what’s going on. You cannot easily navigate within the whole ETL. Solution 1 gives you this overview, but at the cost of a potentially huge job (without the possibility of processes running concurrently).
Also note that the solutions with multiple jobs will need to manage the initialization of the global variables.
What I miss in all these solutions is an optimal re-start of the ETL. If 10 of my 50 processes failed, and I want to restart these 10 only, do I really have to start them manually?
In my next blog post I’ll propose an architecture that addresses this optimal restart.
Until then, please let me know your thoughts about how you manage your ETL dependencies. Any of the 5 solutions mentioned above? A mix? Something else? And how well does it work for you?