The universe is a big place… perhaps the biggest
SO, IN SAP LAND, WHY USE A SEMANTIC LAYER ON TOP OF A CUBE?
That’s pretty much the main question I’d like to answer, or better, discussion I’d like to start. I don’t think there’s a single answer to it, nor am I trying to promote the blanket use of semantic layers on top of any multi-dimensional cube. Some prefer iPhone, some prefer Android; both achieve the same thing (mostly).
This won’t be a complete technical step-by-step tutorial either. I will try to highlight as many technical points, issues and gotchas as possible though. And yes, there are a few, as in SAP land nothing really is “simple” (pun intended). But that is actually the main reason why I’ve wanted to use Universes (our good old Business Objects semantic layers) on top of a cube, and over the past 3 years HANA views too, in order to keep things simple.
Which probably gives away my main answer to why I’d put a Universe on a BW Cube: because it makes reporting simple.
The whole premise of Semantic layers was to disconnect the non-technical business users from the technical IT and database infrastructure. This was achieved by presenting a business-focused layer, using real business terms, descriptions and help texts, and encapsulating the technical configurations within the semantic layer, away from those business users. After that, it was just drag-and-drop of those objects to create queries, add any filters/conditions by dragging some more, and then dragging those retrieved data objects into your cells, tables and charts and off you go. Drop in a few logos, page numbers, headings and titles, dates etc. and voilà, we had a highly formatted report. Add some fancy interaction, filters, input controls, drill-down, hover-over text, documents and other links, and it just kept on getting better and better. View it on your business PC, tablet, and even mobile phone these days. How good’s all of that!?
Just to put things in historical perspective, I’m a Business Objects “traditionalist”. Meaning I’ve known the products pre-SAP takeover, pre-Crystal Merger, pre-WebIntelligence (that lovely black interface), and yes, even pre-graphical Universe Designer!
Post SAP takeover, I and many others struggled to come to terms with how and when SAP would integrate their own BI/Business Warehouse with Business Objects. And arguably that journey is far from complete. Also, the Universe seems to have fallen from grace, not just with SAP.
Star Trek Borg Cube, circa 24th century
SAP have hardly promoted the use of Universes in their landscapes over the past few years. Arguably, they’re not as necessary anymore as when Business Objects were independent. BEx queries and BW Cubes are intertwined as are WebI and Universes. And BEx queries are probably better performing as well. A Universe on a single HANA analytical view is almost “boring”, and the Business Layer’s folder structure may just not be necessary as we’ve got everything in the single HANA view in our Data Foundation (so why create more work?). HANA direct reporting in BI4.2 will make it even less of a necessity to have a Universe.
Then there’s been an emergence of new-school data discovery tools: the Tableaus, Qliks, etc., which have done away with Semantic layers altogether, and connecting into ANY data source is a breeze with most of these tools. That also means business users will need a better understanding of building queries, which arguably wasn’t necessary with a good Semantic layer. I’ll leave open for the moment the question of whether that’s a good thing.
So let’s line up some pros and cons. Now, depending on whether you have a BO or BW background, the pros might be cons, and vice versa.
- It auto-generates the Data Foundation based on the cube’s metadata (refreshing and updating changes to the Cube structure is another story, so that’s a con as well).
- It also auto-generates the Business Layer, inclusive of Folders, Dimensions, Attributes, Measures. But I’d say you don’t need 90% of that, so back to Universe best practices of ONLY building the Folders and objects which you anticipate using.
- You can define your own Dimensions, Attributes and Measures, just like any other Universe build, including combined key+text Dimensions and promoting Attributes to independent Dimensions (so you can use them as query filters).
- You can build pre-defined conditions.
- In the query panel, you can use OR conditions again! Try that in a query built on a BEx query!
- You can build and manage custom navigation paths with mixed Dimensions, so you’re not limited to the ones defined in SAP and BW. Or even better, use the long or short text descriptions from your Characteristic (when re-created as a stand-alone Dimension), down to the lowest level where you could have a key+text Dimension again, say, for Customer or Material.
- Depending on whether your cube is predominantly a star schema or a snowflake, you can change the join structure, or add shortcut joins to improve performance (more on that later with Data Federation).
- The drill-functionality will be WebI native, rather than BEx/BW inherited (in case you don’t like those (+) symbols all over your table), and for most users this will be more intuitive, or dare I say it, elegant even.
- The Data Federation Administration Tool gives great insight into queries, their break-down into sub-queries, the way they are structured, and the way they perform (or not).
- You can build all sorts of security profiles in the Information Design Tool (IDT) which creates extra security options. It also allows for duplication of effort in maintaining your security though.
- A Universe means being able to use tools like Lumira and Explorer, so all your SAP data exploration will be made easier!
- A BEx query will in most cases be more efficient to run against BW.
- Delegated measures will forever be painful, requiring constant refreshing of the data queries. Even though this has seen some major improvements from BI4.0 to BI4.1 (and probably even further in BI4.2), it’s still a major pain in WebI. On the other hand, all those complex Calculated Key Figures from your BEx query will always be calculated with the correct aggregation projection (or context, as we call it for WebI calculations). For Universe measures though, don’t forget to change the aggregation projection from Delegated to the correct aggregation!
- Like any other Universe build: if there’s no great demand to develop a larger set of reports, the maintenance overhead of a Universe will be quite significant.
- For most BW developers creating a BEx query won’t be a major task, so in most cases the build phase of a BEx query will be shorter than a Universe.
- We can’t re-use restricted and calculated key figures from any BEx query (although that may be coming back in BI4.2 for authored BEx Universes). This is actually one of the major drawbacks, as I’ve seen some very well-built BEx queries from experienced BW developers which allowed Daily, Monthly, Month-to-Date and Year-to-Date key-figures to be used in a single WebI query, for example!
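One of the points above, the aggregation projection for delegated measures, deserves a tiny illustration of why it matters. The numbers below are invented, but they show why a ratio key-figure can’t simply be re-summed at a higher aggregation level; it has to be recalculated from its aggregated components:

```python
# Invented example: why a ratio measure must not use a simple SUM projection.
rows = [
    {"region": "North", "revenue": 100, "qty": 10},  # unit price 10
    {"region": "North", "revenue": 300, "qty": 10},  # unit price 30
]

# Wrong: summing the row-level ratios
wrong = sum(r["revenue"] / r["qty"] for r in rows)

# Right: aggregate the components first, then compute the ratio
right = sum(r["revenue"] for r in rows) / sum(r["qty"] for r in rows)

print(wrong, right)  # 40.0 20.0 - same rows, very different answers
```

This is exactly what the database-delegated projection (or a BEx Calculated Key Figure) protects you from: the calculation is pushed down and done at the right aggregation level.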
Again, all these pros and cons are dependent on your specific reporting requirements within your organisation. Within my current project I actually did have the privilege of a bit of creative freedom and tried both Universe and BEx-based report development.
From a pure report-build point of view, I still prefer the Universe. But considering there won’t be many reports built on this Universe, it was a considerable overhead, and from that point of view it was definitely not the right solution. The report drill-downs look pretty cool though!
Again, create your own list of pros and cons and see how they add up for your project.
So, let’s briefly list how it all fits together without getting into too many technical details:
- In the IDT, create a secured Relational Connection to your SAP Cube using a SAP Java Connector (Infoprovider, Multiprovider, even the new Composite providers; if those are not working for you, check out this blog: http://wwwold.forefrontanalytics.com.au/2015/07/31/learnings-from-early-adoption-of-bw7-4-with-bobj-4-1/)
- Set up a local folder in your project space to start working in (stay organised)
- Import a connection shortcut into that folder
- Right-click on that folder to create a new Data Foundation (the naming of Data Foundations is up to you of course, and the way you create consistent coding and naming references is definitely something you should set out within your team)
- In the dialog box to create the new Data Foundation (DF), select MultiSource-Enabled, select the connection shortcut in your folder and in the Define Connection Properties window tick the box to automatically create tables and joins; watch the magic! (it’s a great starting point for your DF, and takes a lot of work out of your hands, trust me)
- Once the DF has been created, right click on it (in your project folder) to create a New Business Layer (BL) and again, let it auto-generate to see how that process works for you
- Save all, right click on the BL and publish it to your repository
For the sake of this short blog, we’ll skip:
- A proper Universe integrity check (assuming it’s just been generated it should pass, but it’s always good practice to run regular checks)
- Again, normally we should use Universe best practices and only build the Folders and objects which we anticipate using. Arguably, the majority of the generated objects (in particular attributes) could be thrown out.
- We can also leave all those auto-generated objects and create new objects using @Select functions to refer to them, which would almost create a Semantic layer in our Universe to the original generated objects.
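As a purely hypothetical sketch of that @Select approach, a new measure defined against the generated objects might look something like this in the object’s SELECT definition (the folder and object names here are made up, not from any real Universe):

```
-- New Business Layer measure "Net Revenue", built by referencing the
-- auto-generated objects instead of repeating their column expressions:
@Select(Key Figures\Revenue) - @Select(Key Figures\Discounts)
```

The benefit is the usual one: if the generated object’s definition changes, every object referring to it via @Select picks up the change automatically.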
- Creating Universe “facilitators” like Lists of Values and Navigation paths (although building one custom navigation path will allow you some additional test opportunities)
So there you go: our BW Cube based Universe is ready for use! It really can be such an easy task, and we could leave the Universe as it is now. But just like restricted and calculated key-figures in BEx queries, it makes sense creating re-usable Measure objects here in the Universe rather than reinventing them in your reports all the time as formulas and variables.
Creating WebI queries works just the same as before; create a new WebI document, insert a query with a Universe as a source, select your result objects, insert a few query filters/prompts and run the query. Build the report tables/charts after that as per usual.
- You can still use a Scope of Analysis in the query to manage the query objects in your editor and drill-down options in your report
- If you created Custom Navigation Paths in your Universe, you can re-use them in the report for drill-down functionality; still works as per usual.
ANALYSING YOUR QUERY
Now, time to head to the Data Federation Administration console! Look in the query monitoring pane, and select the connection we used in the Universe. You’ll see all queries that have been run using Relational connections on that BO server, you’ll see how long they took, the Username and what SQL was run on the database. You can find some detailed info on the Data Federation Administration Tool on SCN: http://scn.sap.com/docs/DOC-55218
If the query was only using objects referring to the central fact table in the Cube, you’ll only see the one query. In most cases you’ll have used objects referring to one or more Dimension and Text tables in the cube, so the query in the Administration console will show an arrow and allow you to drill into the subqueries which were actually run. Generally you will see:
- The main query is the complete query generated by WebI. This one is also used by the Data Federation Service to evaluate and plan the query execution. In the Query Monitoring panel the query will show a status of “Optimising”
- The first query that was run was the “bare” fact-based query for your aggregations: your key-figure based results
- Subsequent queries are run individually on every separate Dimension and Text table
- The above will create intermediate result sets which Data Federation manages on the Business Objects server
- The final (main) query, combining these intermediate result sets into a final result set is run last by Data Federation; it’s more or less replacing all the keys in the fact-based queries (say Material-key and Customer-key) with the text fields or descriptions which you selected as result objects in your query (say Material number/name and Customer name)
- It will show the start and end date/time of every single query, plus duration (nice to monitor how they all behaved)
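The sequence above can be mimicked with a small, purely illustrative Python sketch. The table contents and field names are all invented, but the order of operations matches what the Data Federation Service does: fact query first, then per-table lookups, then the final merge on the BO server:

```python
# Illustrative only: how Data Federation assembles a federated result set.

# Step 1: the "bare" fact-based query - key figures aggregated by their keys.
fact_rows = [
    {"material_key": "M1", "customer_key": "C1", "revenue": 100},
    {"material_key": "M1", "customer_key": "C2", "revenue": 250},
    {"material_key": "M2", "customer_key": "C1", "revenue": 75},
]

# Step 2: separate queries, one per Dimension/Text table, producing
# intermediate result sets (modelled here as simple lookups).
material_text = {"M1": "Widget A", "M2": "Widget B"}
customer_text = {"C1": "ACME Corp", "C2": "Globex"}

# Step 3: the final (main) query merges the intermediate result sets,
# replacing the keys with the descriptions you selected as result objects.
final = [
    {
        "Material": material_text[r["material_key"]],
        "Customer": customer_text[r["customer_key"]],
        "Revenue": r["revenue"],
    }
    for r in fact_rows
]
print(final[0])  # {'Material': 'Widget A', 'Customer': 'ACME Corp', 'Revenue': 100}
```

The crucial detail is that Step 3 happens on the Business Objects server, not in the database, which is where the performance discussion below comes in.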
In earlier tests, working on BW cubes without HANA, the SQL statements in the Administration console were more or less ANSI SQL, so quite easy to read and analyse. When using BW on HANA, you’ll see native HANA scripts, so there’s a learning curve in mastering that kind of script and understanding what a select column, where filter, and aggregation function look like. It takes a bit of time. If you already know HANA SQLScript, all the better.
When you look at the above behaviour, it more or less mimics BW and HANA query behaviour: Run the Fact queries using your keys, then go and replace those keys (or indexes in HANA) with their equivalent text or code parts.
One downside at this point: combining the results isn’t done on the BW or HANA server, but on the Business Objects server… and arguably this isn’t the best server to do this. So this is where most of your performance-creep will take place, and this is why the Data Federation Administration console will be your newfound best friend.
One example I came across was when pulling in the Material descriptions in the query. It ran a complete full-table query on the Material text table, over 50,000 records, and dragged them all across… not the most efficient way to do this. The text table was directly (outer) joined to the fact table in the Cube and Data Foundation, and not actually via the Material Dimension table. Which was strange, as the Dimension table had all references to the product hierarchy, had its own date fields, and important fields like Material Type. That one allowed me to create all sorts of further query filters, allowing for better efficiency. Changing the join to go via the Dimension table (and then onwards to the Fact table) changed the query order in Data Federation to:
- Still pull in the 50,000+ Material text records
- But then being forced to run another merge-query with the Dimension table first, which had a restriction on Material Type to bring the result set down to 1000 records
- These 1000 records were then included in the final query to create the final result set
Say your fact query returns 10,000 result records: Data Federation now runs a merge query of that result set against only 1,000 Material records rather than 50,000+. You’ll feel my pain, and understand how this simple yet typical Universe (shortcut) fix will make a big difference to your query and its performance!
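With invented data and record counts, the effect of routing the join via the Dimension table can be sketched like this; the Material Type restriction is applied before the expensive final merge ever happens:

```python
# Illustrative sketch of the join-path fix (all data and counts invented).

# 50,000+ material text records exist in total...
all_material_text = {f"M{i}": f"Material {i}" for i in range(50_000)}

# ...but the Dimension table carries a Material Type we can restrict on.
material_dim = {
    f"M{i}": {"material_type": "FERT" if i < 1_000 else "ROH"}
    for i in range(50_000)
}

# Join path via the Dimension table: filter first, merge second.
relevant_keys = {
    key for key, attrs in material_dim.items()
    if attrs["material_type"] == "FERT"
}
reduced_text = {key: all_material_text[key] for key in relevant_keys}

# The final merge with the fact result set now touches 1,000 records
# instead of 50,000+.
print(len(all_material_text), len(reduced_text))  # 50000 1000
```

The direct fact-to-text join, by contrast, gives Data Federation no opportunity to apply that restriction before merging, which is exactly the full-table drag described above.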
Well, it’s pretty much up to you if you feel like running a quick test on this. Again, getting a Universe auto-generated isn’t that much of an effort. And the IDT is, in my opinion, a much more modern and flexible platform for creating specific calculated measures (i.e. key-figures).
Not getting any restrictions in your WebI query build, and your subsequent report build and navigation are big plusses as well. In particular if you’re trying to enable business users to create their own queries and reports, I’d have a good look into this. This is still a very valid path to get to a self-serve organisation for your reporting needs.
Buyer beware! Keep an eye on query performance. The fact that some of the queries will be run and managed by the Data Federation Service does mean we lose performance one way or another. I’ve allowed myself this luxury now that we’re using BW on HANA. Trust me, I was much more careful doing all of this on plain BW (or BW Accelerator), as this could be a double performance hit.
Finally, as I mentioned before, there’s plenty more to write about this, and much more technical detail to explore. This is heavily based on my professional opinion and preferences, coming from as I said, a Business Objects “traditionalist” understanding. From a pure performance perspective, this would never be the first choice. But it’s not just about performance, it’s also about usability and ease of use. My resistance may be futile, I know, but I still see plenty of shine in the old French-designed software for it to take on the new wave of exploration tools!