Dr. MetaData

Thursday, July 24, 2014

Usage Tracking Issues 10g -- still!!

OK, this may all be fixed in 11g, but I'm still a 10g guy, four years after it hatched. Having said that, if it isn't fixed in 11g, that would have to be considered a serious sign of being asleep at the switch. If it is fixed, then maybe 11g starts to look better to me -- somewhat, anyway.

So here's the set of problems I see with usage tracking in 10g. You have two choices: either direct inserts of usage records into a table (s_nq_acct), or usage records written to files (a new file every X minutes, as determined by the parameters in NQSConfig.ini).
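For reference, the knobs for both modes live in the [USAGE_TRACKING] section of NQSConfig.ini. It looks roughly like this -- the path, connection pool, and table qualifiers below are placeholders, so substitute your own, and FILE_ROLLOVER_INTERVAL_MINUTES is the "new file every X minutes" part:

[ USAGE_TRACKING ]
ENABLE = YES;
DIRECT_INSERT = NO;
# file mode settings (used when DIRECT_INSERT = NO)
STORAGE_DIRECTORY = "/some/path/for/usage/files";
CHECKPOINT_INTERVAL_MINUTES = 5;
FILE_ROLLOVER_INTERVAL_MINUTES = 30;
CODE_PAGE = "ANSI";
# direct insert settings (used when DIRECT_INSERT = YES)
PHYSICAL_TABLE_NAME = "Usage DB"."Catalog"."Schema"."S_NQ_ACCT";
CONNECTION_POOL = "Usage DB"."Usage Connection Pool";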

The biggest issue with direct insert is that the QUERY_TEXT column is only varchar2(1024), which means you're only going to record a fraction (maybe a tenth!) of the entire logical query. If you expand it, it pretty much shuts down direct insert. So this column doesn't give you the information you need to really understand whatever issues arose when the query was executed: you can't see all of it, so you can't duplicate it and examine the physical SQL that would have been generated.
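If you want a rough sense of how much this is costing you, a quick check against the standard table is something like the following -- anything that hit the 1024-character ceiling was almost certainly cut off:

SELECT COUNT(*) AS probably_truncated
FROM   s_nq_acct
WHERE  LENGTH(query_text) >= 1024;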

Another issue is that it doesn't differentiate whether a query is being run on behalf of a user via an iBot or whether the user is running it interactively as a dashboard/Answers user.

I should also mention another problem that tripped me up for a while until, thanks to a blog post by someone else, the light bulb finally went on. The s_nq_acct table must be defined in the repository's physical layer. The Oracle database DDL (in the OracleBI/server/schema folder) for the table includes several columns that are of type DATE. If the data types for these columns are DATE in the physical layer of the repository -- i.e. if the datatypes apparently exactly match the DDL -- direct inserts will fail. The DATE columns must be defined as DATETIME in the metadata (which they will be if you import the table, but be careful about setting it up manually).

So what happens if you take the second alternative and have the BI Server write usage information to files (i.e. DIRECT_INSERT = NO in NQSConfig.ini)? Now you DO get all the logical SQL. However, the file doesn't contain all the other fields that would have been written in direct insert mode! Most importantly, QUERY_SRC_CD is always null, instead of telling you whether the query was one of:

GlobalFilter
Report
ValuePrompt
drill
Null (??)

Normally, if I'm looking at usage, I'm interested in counting only the "Report" queries. Without direct insert you can't differentiate -- you have to count everything. Not that the direct insert categorization is necessarily what you'd like if you had a choice (how about iBot being one of the query types?), but still it contains pretty essential information. Why isn't it written to the file? Why isn't what's written to the file (or directly to a table) configurable, for that matter?
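For what it's worth, with direct insert the kind of per-day count I'm after is trivial. A sketch against the standard columns:

SELECT TRUNC(start_ts) AS run_date,
       COUNT(*)        AS report_queries
FROM   s_nq_acct
WHERE  query_src_cd = 'Report'
GROUP BY TRUNC(start_ts)
ORDER BY run_date;

With the file-based data there's no QUERY_SRC_CD to filter on, so the WHERE clause has to go and everything gets counted.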

For several months I used the file method rather than the direct insert method because (a) direct insert stopped working [since fixed by changing DATE to DATETIME in the rpd -- see above] and (b) I really wanted the entire logical query text.

The consequence of doing this was that I periodically took all the usage files that had been generated and loaded them into a new table (S_NQ_ACCT_ADD) using SQLLDR. Doing this was a bit of a pain, but it enabled analysis of the usage data which otherwise, just sitting in a text file, would be pretty useless.

A word of advice here: change the datatype of QUERY_TEXT in the new table to CLOB, and increase the size of the NODE_ID column from varchar2(15) to varchar2(128); otherwise loads will fail.
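So the load table ends up looking something like this. The column list matches the control file below; the datatypes on the numeric and HOUR_MIN columns are my guesses at the shipped DDL, so check yours:

CREATE TABLE s_nq_acct_add (
  user_name          VARCHAR2(128),
  repository_name    VARCHAR2(128),
  subject_area_name  VARCHAR2(128),
  node_id            VARCHAR2(128),   -- widened from varchar2(15)
  start_ts           DATE,
  start_dt           DATE,
  start_hour_min     VARCHAR2(5),
  end_ts             DATE,
  end_dt             DATE,
  end_hour_min       VARCHAR2(5),
  query_text         CLOB,            -- instead of varchar2(1024)
  success_flg        NUMBER(10),
  row_count          NUMBER(10),
  total_time_sec     NUMBER(10),
  compile_time_sec   NUMBER(10),
  num_db_query       NUMBER(10),
  cum_db_time_sec    NUMBER(10),
  cum_num_db_row     NUMBER(10),
  cache_ind_flg      CHAR(1)
);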

The pertinent section of the sqlldr control file then says:

APPEND INTO TABLE S_NQ_ACCT_ADD
FIELDS TERMINATED BY ";"
TRAILING NULLCOLS
(
  USER_NAME          CHAR(128),
  REPOSITORY_NAME    CHAR(128),
  SUBJECT_AREA_NAME  CHAR(128),
  NODE_ID            CHAR(128),
  START_TS           DATE "YYYY-MM-DD HH24:MI:SS",
  START_DT           DATE "YYYY-MM-DD",
  START_HOUR_MIN,
  END_TS             DATE "YYYY-MM-DD HH24:MI:SS",
  END_DT             DATE "YYYY-MM-DD",
  END_HOUR_MIN,
  QUERY_TEXT         CHAR(40000),
  SUCCESS_FLG,
  ROW_COUNT,
  TOTAL_TIME_SEC,
  COMPILE_TIME_SEC,
  NUM_DB_QUERY,
  CUM_DB_TIME_SEC,
  CUM_NUM_DB_ROW,
  CACHE_IND_FLG      CHAR(1)
)

As a CLOB, the query text can be inserted. However, the data can't be queried via OBIEE very successfully since OBIEE issues a SELECT DISTINCT, which fails with a CLOB. So you also need to load the data from this table into another table, where the query_text column is a varchar.
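One way to do that copy is a sketch like the one below -- the target table name is invented, and since varchar2 tops out at 4000 characters, truly monstrous queries still get trimmed, just much later than at 1024:

CREATE TABLE s_nq_acct_rpt AS
SELECT user_name, repository_name, subject_area_name, node_id,
       start_ts, start_dt, start_hour_min, end_ts, end_dt, end_hour_min,
       CAST(DBMS_LOB.SUBSTR(query_text, 4000, 1) AS VARCHAR2(4000)) AS query_text,
       success_flg, row_count, total_time_sec, compile_time_sec,
       num_db_query, cum_db_time_sec, cum_num_db_row, cache_ind_flg
FROM   s_nq_acct_add;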

Yet another issue: when the user has set a session variable, that session variable setting shows up in the logical SQL, delimited from the rest of the logical SQL by a semi-colon. In other words, it looks like a separate column, since the columns in the file are delimited by semi-colons. SQLLDR rejects these records to a .BAD file because of the column mismatch, so they don't get inserted.

You can load these bad records into another table (called, for example, "s_nq_acct_bad") which contains the extra column, and then in a separate SQL statement the relevant columns can be inserted later into s_nq_acct.
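The shape of that second step is roughly the following. This is purely illustrative: the bad-record table and its extra SET_VARIABLE_TEXT column are names I made up, I'm targeting the CLOB load table described above, and exactly where the extra field lands depends on your control file:

INSERT INTO s_nq_acct_add
  (user_name, repository_name, subject_area_name, node_id,
   start_ts, start_dt, start_hour_min, end_ts, end_dt, end_hour_min,
   query_text, success_flg, row_count, total_time_sec, compile_time_sec,
   num_db_query, cum_db_time_sec, cum_num_db_row, cache_ind_flg)
SELECT user_name, repository_name, subject_area_name, node_id,
       start_ts, start_dt, start_hour_min, end_ts, end_dt, end_hour_min,
       set_variable_text || ';' || query_text,   -- stitch the split text back together
       success_flg, row_count, total_time_sec, compile_time_sec,
       num_db_query, cum_db_time_sec, cum_num_db_row, cache_ind_flg
FROM   s_nq_acct_bad;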

All in all, it's a multi-step workaround process. It really gives the impression that usage tracking wasn't given much attention by the development team.

[Further note a year later: The problems associated with writing usage to files became too much work to deal with. We went back to direct inserts despite the shortcomings.]

Wednesday, December 31, 2008

Entering Column Names In Online Mode

This problem with the admin tool is so annoying that I thought I'd mention it on this blog. When you're using the admin tool in online mode, there can be a very long delay when typing in names of new logical columns. When connecting over the Internet (e.g. via VPN) the delay can increase. So you type...and nothing happens...and then you type the same thing again because you don't see anything...and a while later you notice the additional typing was indeed redundant. Then you have to delete...and more delay. And while these delays are going on, your cursor won't appear in the logical table dialog window. You could well conclude that working with the admin tool this way just isn't practical.

Maybe someday Oracle will fix this, if they're not too busy adding new features to fix existing features. (Personally, there aren't many new features I'm dying for. And when I see them, well, they rarely bowl me over, a subject for another post. But polishing existing features would really be nice. "Just saying.")

In the meantime, here's the WORKAROUND I use: type the logical column name in the Description edit box. No delays there! Then copy and paste that into the Name edit box. There aren't any delays when you do that, either. This works great -- until you forget, which you inevitably do.

Oracle, if you're reading this, here's another easy thing to fix: make Ctrl-A (select all) work in every edit box window -- with highest priority on the edit box where you enter the SQL for physical layer SQL objects.

Thursday, October 30, 2008

Time Series Functions

BI EE has the time series functions Ago() and ToDate(). Have you ever wondered how these work? The best way to find out is to examine the SQL that is generated when a query includes these functions. The structure of the SQL is determined by the database features supported by the target database. This presentation displays what happens in the case of Oracle 10gR2 (and 11g), which are "best case" databases for these functions: How Time Series Functions Work.
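For anyone who hasn't written one: a logical column formula using these functions takes the measure, a level of a time dimension, and (for Ago) the number of periods to step back -- along the lines of the following, where the subject area, measure, and level names are invented:

Ago("Sales"."Revenue", "Time"."Year", 1)
ToDate("Sales"."Revenue", "Time"."Year")

The first returns Revenue as of one Year back; the second accumulates Revenue from the start of the current Year. The physical SQL generated for these is what the presentation walks through.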

Tuesday, May 13, 2008

Leaving Oracle

In April, I left Oracle and began working as the BI Strategy Lead for a company that intends to use OBI EE extensively in a SaaS (Software as a Service) model.

Today I deleted all the previous posts on this blog which I put up while I was working for Oracle. So the blog is now a clean slate.

As the universe of OBI EE users grows, there is an ever-increasing amount of activity in the "blogosphere" about the product. When I read these blogs, I have various reactions: some are very good, some information is wrong (even in the very good blogs), some people are writing about OBI EE who seem to have very limited backgrounds with the product. I find some postings to be amusing -- particularly those that frame their information as great insights, but which would have been covered in the first half day of normal OBI EE training. Oh, well.

I intend to keep this blog centered on metadata and modeling techniques, with occasional forays into useful web UI techniques. Also, as a non-Oracle employee, I might feel a little freer about talking about product "idiosyncrasies" -- the things about the product that drive me crazy. (Yes, there are some.) We'll see. I was part of the product team from the very beginning, when the product was still on the drawing board in 1998, and so I have a lot of affection for the product and the great group of people that developed it. In my opinion, it's still clearly the best BI platform available by a long shot, and in my new position I'm excited about the possibilities of using it to drive new business for the company I now work for. Whatever insights about the product I develop while in this capacity, I'll post here. Stay posted.