Some of the Mondrian classes are instrumented with Apache Log4J Loggers. For some of these classes there are certain logging setting that provide information for not just the code developer but also for someone setting up a Mondrian installation. The following is a list of some of those log setting and the associated information.
Category |
Level |
Description |
mondrian.rolap.aggmatcher.AggTableManager
|
INFO |
A list of the RolapStar fact table names (aliases) and for each fact table, a list of all of its associated aggregate tables. |
mondrian.rolap.aggmatcher.AggTableManager
|
DEBUG |
A verbose output of all RolapStar fact tables, their measures columns, and dimension tables and columnns, along with all of each fact table's aggregate tables, columns and dimension tables. |
mondrian.rolap.aggmatcher.DefaultDef
|
DEBUG |
For each candidate aggregate table, the Matcher regular expressions for matching: table name and the fact count, foreign key, level and measure columns. Helpful in finding out why an aggregate table was not recognized. |
mondrian.rolap.agg.AggregationManager
|
DEBUG |
For each aggregate Sql query, if an aggregate table can be used to fulfill the query, which aggregate it was along with bitKeys and column names. |
mondrian.rolap.RolapUtil
|
DEBUG |
Prints out all Sql statements and their execution time.
If one set the Mondrian property,
mondrian.rolap.generate.formatted.sql
to true, then the Sql is pretty printed (very nice).
|
mondrian.rolap.RolapConnection
|
DEBUG |
Prints out each MDX query prior to its execution. (No pretty printing, sigh.) |
mondrian.rolap.RolapSchema
|
DEBUG |
Prints out each Rolap Schema as it is being loaded. |
There are more classes with logging, but their logging is at a lower, more detailed level of more use to code developers.
Log levels can be set in either a log4j.properties file or log4j.xml file. You have to make sure you tell Mondrian which one to use. For the log4j.properties, entries might look like:
log4j.category.mondrian.rolap.RolapConnection=DEBUG
log4j.category.mondrian.rolap.RolapUtil=DEBUG
while for the log4.xml:
<category name="mondrian.rolap.RolapConnection">
<priority value="DEBUG"/>
</category>
<category name="mondrian.rolap.RolapUtil">
<priority value="DEBUG"/>
</category>
The default Mondrian rules for recognizing aggregate tables
are specified by creating an instance of the rule schema
found in the file:
MONDRIAN_HOME/src/main/rolap/aggmatcher/DefaultRulesSchema.xml.
The instance of this schema that is built into the
mondrian.jar
after a build is in the same directory,
MONDRIAN_HOME/src/main/rolap/aggmatcher/DefaultRules.xml.
There are six different default rules that are used to match and map
a candidate
aggregate table: table name, ignore column, fact count column, foreign
key column, level column and measure column. All of these rules are
defined by creating an instance of the DefaultRulesSchema.xml grammar.
The DefaultRulesSchema.xml instance, the DefaultRules.xml file mentioned
above, that by default is built as part of the mondrian.jar does not
contain an ignore column rule.
This grammar has base/supporting classes that are common to the above
rules. In
XOM
terms, these are classes and super classes of the rule elements.
The first
XOM
class dealing with matching is the
CaseMatcher
class.
This has an attribute "charcase" that takes the legal values of
"ignore" (default)
"exact"
"upper"
"lower"
When the value of the attribute is "ignore", then the regular
expression formed by an element extending the
CaseMatcher
class will be case independent for both any parameters used
to instantiate the regular expression template as well as for
the text in the post-instantiated regular expression.
On the other hand, when the "charcase" attribute take any of
the other three values, it is only the parameter values themselves
that are "exact", unchanged, "lower", converted to lower case, or
"upper", converted to upper case.
The class
NameMatcher
extends the
CaseMatcher
class. This class
has pre-template and post-template attributes whose default values
is the empty string. These attributes are prepended/appended
to a parameter to generate a regular expression.
As an example, the
TableMatcher
element extends
NameMatcher
class.
The parameter in this case is the fact table name and the regular
expression would be:
pre-template-attribute${fact_table_name}post-template-attribute
For Mondrian, the builtin rule has the pre template value "agg_.+_" and the post template attribute value is the default so the regular expression becomes:
agg_.+_${fact_table_name}
Also, the
NameMatcher
has an attribute called
basename
which is optional. If set, then its value must be a regular expression
with a single capture group. A capture group is an regular expression
component surrounded by "(" and ")". As an example, "(.*)" is
a capture group and if this was the total regular expression, then
it would match anything and the single capture would match the same.
On the other hand if the total regular expression was "RF_(.*)_TBL", then
a name such as "RF_SHIPPMENTS_TBL" would match the regular expression while
the capture group would be "SHIPPMENTS".
Now, if the
basename
attribute is defined, then it is applied to each fact table name allowing
one to strip away information and get to the "base" name. This might
be needed because a DBA might prepend or append a tag to all of
your fact table names and the DBA might wish to have a different tag
prepend or append to all of your aggregate table names (RF_SHIPPMENTS_TBL
as the fact table and RA_SHIPPMENTS_AGG_14 as an example aggregate
name (the DBA prepended the "RA_" and you appended the "_AGG_14")).
Both the
FactCountMatch
and
ForeignKeyMatch
elements also extend the
NameMatcher
class. In these cases, the builtin Mondrian rule has no
pre or post template attribute values, no regular expression,
The
FactCountMatch
takes no other parameter from the fact table (the fact table does not
have a fact count column) rather it takes a fact count attribute
with default value "fact_count", and this is used to
create the regular expression.
For the
ForeignKeyMatch
matcher, its the fact table's foreign key that is used
as the regular expression.
The ignore, asdf level and measure column matching elements have one or more
Regex
child elements. These allow for specifying multiple possible matches
(if any match, then its a match).
The
IgnoreMap,
LevelMap
and
MeasureMap
elements extend the
RegexMapper
which holds an array of
Regex
elements.
The
Regex
element extends
CaseMatcher
It has two attributes,
space
with default value '_' which says how space characters should be mapped,
and
dot
with default value '_' which says how '.' characters should be mapped.
If a name were the string "Unit Sales.Case" then (with the default
values for the
space
and
dot
attributes and with
CaseMatcher
mapping to lower case
) this would become "unit_sales_case".
The
IgnoreMap
element has NO template parameter names. Each
Regex
value is simply a regular expression. As an example (Mondrian by
default does not include an
IgnoreMap
by default), a regular expression that matches all aggregate table columns
then end with '_DO_NOT_USE' would be:
.*_DO_NOT_USE
One might want to use an
IgnoreMap
element to filter out aggregate columns if, for example, the aggregate
table is a materialized view, since with each "normal" column
of such a materialized view there is an associated support column
used by the database which has no significance to Mondrian.
In the process of recognizing aggregate tables, Mondrian logs
a warning message for each column whose use can not be determined.
Materialized views have so many of these support columns
that if, in fact, there was a column whose use was desired
but was not recognized (for instance, the column name is
misspelt) all of the materialized view column warning message
mask the one warning message that one really needs to see.
The
IgnoreMap
regular expressions are applied before any of the other column
matching actions. If one sets the
IgnoreMap
regular expression to, for example,
.*
then all columns are marked as "ignore" and there are no other
columns left to match anything else. One must be very careful
when choosing
IgnoreMap
regular expressions not just for your current columns but for
columns that might be created in the future. Its best to document
this usage in your organization.
The following is what the element might look like in a DefaultRules.xml file:
<IgnoreMap id="ixx" > <Regex id="physical" charcase="ignore"> .*_DO_NOT_USE </Regex> </IgnoreMap>
The
LevelMap
element has the four template parameter names (hardcoded):
hierarchy_name
level_name
level_column_name
usage_prefix
These are names that can be used in creating template regular
expressions.
The builtin Mondrian default rules for level matching defines
three
Regex
child elements for the
LevelMap
element.
These define the template regular expressions:
${hierarchy_name}_${level_name}
${hierarchy_name}_${level_column_name}
${usage_prefix}${level_column_name}
${level_column_name}
Mondrian while attempting to match a candidate aggregate table against a particular fact table, iterates through the fact table's cube's hierarchy name, level name and level colum names looking for matches.
The
MeasureMap
element has the three template parameter names (hardcoded):
measure_name
measure_column_name
aggregate_name
which can appear in template regular expressions.
The builtin Mondrian default rules for measure matching defines
three
Regex
child elements for the
MeasureMap
element.
These are
${measure_name}
${measure_column_name}
${measure_column_name}_${aggregate_name}
and Mondrian attempts to match a candidate aggregate table's column names against these as it iterators over a fact table's measures.
A grouping of
FactCountMatch
,
ForeignKeyMatch
,
TableMatcher
,
LevelMap
,
and
MeasureMap
make up a
AggRule
element, a rule set.
Each
AggRule
has a
tag
attribute which is a unique identifier for the rule.
There can be multiple
AggRule
elements in the outer
AggRules
element.
Each
AggRule
having its own
tag
attribute.
When Mondrian runs, it selects (via the
mondrian.rolap.aggregates.rule.tag
property) which rule set to use.
One last wrinkle, within a
AggRule
the
FactCountMatch
,
ForeignKeyMatch
,
TableMatcher
,
LevelMap
,
and
MeasureMap
child elements can be either defined explicitly within the
AggRule
element or by reference
FactCountMatchRef
,
ForeignKeyMatchRef
,
TableMatcherRef
,
LevelMapRef
,
and
MeasureMapRef
The references are defined as child elements of the top level
AggRules
element. With references the same rule element
can be used by more than one
AggRule
(code reuse).
Below is an example of a default rule set with rather different matching rules.
<AggRules tag="your_mamas_dot_com"> <AggRule tag="default" > <FactCountMatch id="fca" factCountName="FACT_TABLE_COUNT" charcase="exact" /> <ForeignKeyMatch id="fka" pretemplate="agg_" /> <TableMatch id="ta" pretemplate="agg_" posttemplate="_.+"/> <LevelMap id="lxx" > <Regex id="logical" charcase="ignore" space="_" dot="_"> ${hierarchy_name}_${level_name} </Regex> <Regex id="mixed" charcase="ignore" > ${hierarchy_name}_${level_name}_${level_column_name} </Regex> <Regex id="mixed" charcase="ignore" > ${hierarchy_name}_${level_column_name} </Regex> <Regex id="usage" charcase="exact" > ${usage_prefix}${level_column_name} </Regex> <Regex id="physical" charcase="exact" > ${level_column_name}_.+ </Regex> </LevelMap> <MeasureMap id="mxx" > <Regex id="one" charcase="lower" > ${measure_name}(_${measure_column_name}(_${aggregate_name})?)? </Regex> <Regex id="two" charcase="exact" > ${measure_column_name}(_${aggregate_name})? </Regex> </MeasureMap> </AggRule> </AggRules>
First, all fact count columns must be called
FACT_TABLE_COUNT
exactly, no ignoring case.
Next, foreign key columns match the regular expression
agg_${foreign_key_name}
that is, the fact table foreign key column name with "agg_" prepened such
as
agg_time_id
.
The aggregate table names match the regular expression
agg_${fact_table_name}_.+
For the FoodMart
sales_fact_1997
fact table, an aggregate could be named,
agg_sales_fact_1997_01
agg_sales_fact_1997_lost_time_id
agg_sales_fact_1997_top
If the hierarchy, level and level column names were:
hierarchy_name="Sales Location"
level_name="State"
level_column_name="state_location"
usage_prefix=null
then the following aggregate table column names would be recognizing as level column names:
SALES_LOCATION_STATE
Sales_Location_State_state_location
state_location_level.
If in the schema file the DimensionUsage for the hierarchy had a usagePrefix attribute,
usage_prefix="foo_"
then with the above level and level column names and usage_prefix the following aggregate table column names would be recognizing as level column names:
SALES_LOCATION_STATE
Sales_Location_State_state_location
state_location_level.
foo_state_location.
In the case of matching measure columns, if the measure template parameters have the following values:
measure_name="Unit Sales"
measure_column_name="m1"
aggregate_name="Avg"
then possible aggregate columns that could match are:
unit_sales_m1
unit_sales_m1_avg
m1
m1_avg
The intent of the above example default rule set is not that they are necessarily realistic or usable, rather, it just shows what is possible.
Mondrian supports dimensions with all of their levels lumped into a single table (with all the duplication of data that that entails), but also snowflakes. A snowflake dimension is one where the fact table joins to one table (generally the lowest) and that table then joins to a table representing the next highest level, and so on until the top level's table is reached. For each level there is a separate table.
As an example snowflake, below is a set of Time levels and four possible join element blocks, relationships between the tables making up the Time dimension. (In a schema file, the levels must appear after the joins.)
<Level name="Calendar Year" table="TimeYear" column="YEAR_SID" nameColumn="YEAR_NAME" levelType="TimeYears" uniqueMembers="true"/> <Level name="Quarter" table="TimeQtr" column="QTR_SID" nameColumn="QTR_NAME" levelType="TimeQuarters" uniqueMembers="true"/> <Level name="Month" table="TimeMonth" column="MONTH_SID" nameColumn="MONTH_ONLY_NAME" levelType="TimeMonths" uniqueMembers="false"/> <Level name="Day" table="TimeDay" column="DAY_SID" nameColumn="DAY_NAME" levelType="TimeDays" uniqueMembers="true"/> <Join leftAlias="TimeYear" leftKey="YEAR_SID" rightAlias="TimeQtr" rightKey="YEAR_SID" > <Table name="RD_PERIOD_YEAR" alias="TimeYear" /> <Join leftAlias="TimeQtr" leftKey="QTR_SID" rightAlias="TimeMonth" rightKey="QTR_SID" > <Table name="RD_PERIOD_QTR" alias="TimeQtr" /> <Join leftAlias="TimeMonth" leftKey="MONTH_SID" rightAlias="TimeDay" rightKey="MONTH_SID" > <Table name="RD_PERIOD_MONTH" alias="TimeMonth" /> <Table name="RD_PERIOD_DAY" alias="TimeDay" /> </Join> </Join> </Join> <Join leftAlias="TimeQtr" leftKey="YEAR_SID" rightAlias="TimeYear" rightKey="YEAR_SID" > <Join leftAlias="TimeMonth" leftKey="QTR_SID" rightAlias="TimeQtr" rightKey="QTR_SID" > <Join leftAlias="TimeDay" leftKey="MONTH_SID" rightAlias="TimeMonth" rightKey="MONTH_SID" > <Table name="RD_PERIOD_DAY" alias="TimeDay" /> <Table name="RD_PERIOD_MONTH" alias="TimeMonth" /> </Join> <Table name="RD_PERIOD_QTR" alias="TimeQtr" /> </Join> <Table name="RD_PERIOD_YEAR" alias="TimeYear" /> </Join> <Join leftAlias="TimeMonth" leftKey="MONTH_SID" rightAlias="TimeDay" rightKey="MONTH_SID" > <Join leftAlias="TimeQtr" leftKey="QTR_SID" rightAlias="TimeMonth" rightKey="QTR_SID" > <Join leftAlias="TimeYear" leftKey="YEAR_SID" rightAlias="TimeQtr" rightKey="YEAR_SID" > <Table name="RD_PERIOD_YEAR" alias="TimeYear" /> <Table name="RD_PERIOD_QTR" alias="TimeQtr" /> </Join> <Table name="RD_PERIOD_MONTH" alias="TimeMonth" /> </Join> <Table name="RD_PERIOD_DAY" alias="TimeDay" /> </Join> <Join leftAlias="TimeDay" leftKey="MONTH_SID" rightAlias="TimeMonth" rightKey="MONTH_SID" > <Table name="RD_PERIOD_DAY" alias="TimeDay" /> <Join leftAlias="TimeMonth" leftKey="QTR_SID" rightAlias="TimeQtr" rightKey="QTR_SID" > <Table name="RD_PERIOD_MONTH" alias="TimeMonth" /> <Join leftAlias="TimeQtr" leftKey="YEAR_SID" rightAlias="TimeYear" rightKey="YEAR_SID" > <Table name="RD_PERIOD_QTR" alias="TimeQtr" /> <Table name="RD_PERIOD_YEAR" alias="TimeYear" /> </Join> </Join> </Join>
Viewed as trees these can be represented as follows:
| --------------- | | Year -------------- | | Quarter --------- | | Month Day | ---------------- | | -------------- Year | | --------- Quarter | | Day Month | ---------------- | | -------------- Day | | --------- Month | | Year Quarter | --------------- | | Day -------------- | | Month --------- | | Quarter Year
It turns out that these join block are equivalent; what table joins to what other table using what keys. In addition, they are all (now) treated the same by Mondrian. The last join block is the canonical representation; left side components are levels of greater depth than right side components, and components of greater depth are higher in the join tree than those of lower depth:
| --------------- | | Day -------------- | | Month --------- | | Quarter Year
Mondrian reorders these join blocks into the canonical form and uses that to build subtables in the RolapStar.
In addition, if a cube had a
DimensionUsage
of this Time dimension with, for example, its
level
attribute set to Month, then the above tree is pruned
| -------------- | | Month --------- | | Quarter Year
and the pruned tree is what is used to create the subtables in the RolapStar. Of course, the fact table must, in this case, have a MONTH_SID foreign key.
Note that the
Level
element's table attribute MUST use the table alias and NOT the table name.
With Java5, developers using its memory monitoring capabilities
need to make sure the code they create will best use this new feature.
In particular, if a given algorithm which uses significant memory
is surrounded by
block in which a MemoryMonitor.Listener
has been
registered with the MemoryMonitor
, then the code
must periodically check if a memory notification has occurred.
If the algorithm has long stretches of allocating memory for
data structures that will exist throughout the life-time of the
algorithm's execution during which it does not check for
memory notifications, then it is possible that an
OutOfMemoryError
could still occur.
You can see for the ResultSet object where, basically, all memory
is created in its constructor, throughout the Member determination
and value evaluation code, the Query object's checkCancelOrTimeout
method is called repeatedly.
The Java5 memory management mechanism is not fool proof, so to speak.
If one, as an example, attempts to allocate a very big
array, an OutOfMemoryError
will occur. This
technique works best when memory is allocated incrementally between
checks for memory notifications allowing the developer to
take steps before a possible OOME gotterdammerung.
One last issue, if a developer needs to embed Mondrian in
a Web or Application server and the server has its own
way of dealing with Java5 memory notification system, then
it is important that Mondrian be a good application citizen
in the server. It is much like the use of JAAS in an
application. A JVM allows for a single JAAS instance and
most servers register their mechanism with the JVM.
It is bad for the application in the server to use its own
JAAS rather than register with the server's. So, if
Mondrian is in a Web or Application server that has its own
dealings with the Java5 memory notification system and
the server expects applications to use its mechanism, then the developer
must create an instance of the MemoryManager
interface
that communicates the Webserver/Appserver mechanism and
uses a System property to instruct the MemoryManagerFactory
to create that specialized version.
The developer can create their own Roles by
implementing the Role
interface or by taking an
existing Role instance and wrapping it in an object
derived from and overriding some of the
methods of the DelegatingRole
and DelegatingRole.HierarchyAccess
classes.
In both cases, some care must be taken not to stray too
far from the semantics of the default Mondrian Role
implementation, the RoleImpl
class.
When implementing one's own Role
the Role
interface has methods that return an
Access
object for
Schema
,
Cube
,
Dimension
,
Hierarchy
,
Level
,
Member
and
NamedSet
all of which must have implementations.
One reason one might wish to create one's own Role
implementations
is to avoid defining Roles in the Schema
definition. This allows
the Mondrian container to dynamically generate new Roles while using the
same Schema
definitio nand, therefore,
the same in-memory caches associated with that Schema
object.
Such Roles do not need to be registered with the Schema
object;
they are associated with the Connection
.
Another reason one might wish to implement one's own Roles is that
there might be an existing permission system and, rather than
have duplicate information: in the permission system and the Schema
definition, one simply creates Roles based upon permission system
information.
If one wishes simply to alter or extend the semantics of the
existing Role
implementation, the RoleImpl
class, then
using the DelegatingRole
class is a reasonably
utilitarian approach. This requires that one create a Role
implementation derived from the DelegatingRole
class,
in the Mondrian container call the Schema.lookupRole(String)
method to get the Role
whose semantics are to be modified, create
an instance of the Role derived from the DelegatingRole
that wraps the underlying Role
, and, finally, set the
Connection
's Role
by calling
Connection.setRole(Role)
with this wrapping Role
.
The following code is an example where the underlying Role
is wrapped in a class that extends the DelegatingRole
class.
Here, the user has no access to the store where "Joe Bob" is the
manager.
public class RoleExample extends DelegatingRole {
.....
public static class HierarchyAccessExample
extends DelegatingRole.HierarchyAccess {
.....
public Access getAccess(Member member) {
Access access = hierarchyAccess.getAccess(member);
return getAccess(member, access);
}
.....
}
.....
public Access getAccess(Member member) {
Access access = role.getAccess(member);
return getAccess(member, access);
}
.....
// no one see's information about the store where "Joe Bob" is manager.
protected Access getAccess(Member member, Access access) {
final String storeNamelevel =
"[Store].[Store Country].[Store State].[Store City].[Store Name]";
if (member.getLevel().getUniqueName().equals(storeNamelevel)) {
Object o = member.getProperty("Store Manager");
return (o != null && o.equals("Joe Bob")) ? Access.NONE : access;
} else {
return access;
}
}
}
In this case, special care must be taken if one over rides one of the
methods that yield a Member
's Access
.
This is because there are two such methods. The first is in the
Role
itself Role.getAccess(Member)
and the second is Role.HierarchyAccess.getAccess(Member)
.
Internally, Mondrian is certain code paths calls one of the methods
while in other code paths it calls the other, thus they should be
overridden in a consistent manner.
Author: Julian Hyde, Richard Emberson; last updated August, 2006.
Version: $Id: //open/mondrian/doc/developer_notes.html#12 $
(log)
Copyright (C) 2005-2007 Julian Hyde