DEV Community

loading...

How to explore ClickHouse data via Excel PivotTable using Mondrian

sergeisemenkov profile image Sergei Semenkov Updated on ・5 min read

ClickHouse is a very powerful OLAP engine. Microsoft Excel is one of the world's most popular and powerful business programs. There are several ways to bring ClickHouse data to Excel spreadsheets.

One way to do this is using ClickHouse ODBC driver. In order to do this you have to install the ODBC driver and create ClickHouse data source in Excel. Then you can explore data as tables or you can run a query on ClickHouse and browse results.

The other way to explore ClickHouse data in Excel is using Mondrian. This approach allows you to describe ClickHouse data as a multidimensional model and give access to this model through XMLA standart. It also allows client applications to run MDX queries on ClickHouse data. You represent ClickHouse data as cubes and dimensions and then browse those cubes in Excel PivotTable.

This article describes how to install Mondrian, create a Mondrian datasource for ClickHouse database and connect Excel PivotTable to this datasource.

We will use the eMondrian fork of the Mondrian ROLAP engine. The main feature of this version is support of ADOMD.NET and Excel clients. eMondrian already includes ClickHouse driver.

Mondrian can be hosted on different java servers (Apache Tomcat, Jetty, Tiny Java Web Server). This example describes the case with Ubuntu operating system and Apache Tomcat java server.

In this example Tomcat is installed in /opt/tomcat/latest/ directory.

You have to change the following command to your Tomcat path.

export TOMCATDIR="/opt/tomcat/latest"
Enter fullscreen mode Exit fullscreen mode

Next commands download the Mondrian archive, copy it to Tomcat directory and restart Tomcat java server.

sudo wget https://github.com/SergeiSemenkov/eMondrian/archive/main.zip
sudo unzip -j main.zip eMondrian-main/dist/emondrian.war -d $TOMCATDIR/webapps
sudo systemctl restart tomcat
Enter fullscreen mode Exit fullscreen mode

Now, let's connect Excel to the Mondrian demo Foodmart database.

In order to get address to your Mondrian XMLA service you have to insert your Ubuntu server ip in following template
http://<your_server_address>:8080/emondrian/xmla
To get your server ip, run command

hostname -I
Enter fullscreen mode Exit fullscreen mode

In this example Mondrian server address is
http://192.168.100.4:8080/emondrian/xmla

Open Excel, select the Data tab, then Get Data -> From Database -> From Analysis Services.

Paste address of Mondrian XMLA service as Server name and click Next.

Then choose a cube and click Finish. The PivotTable with cube data will appear.

Next, we will create a Mondrian schema for ClickHouse example dataset OnTime. The schema defines two cubes. Cube OnTime is based on data from ontime table. Cube ViewOnTime is an example of how to create a cube based on a query instead of a table.

Following command will create new schema file OnTime.xml.

sudo nano $TOMCATDIR/webapps/emondrian/WEB-INF/schema/OnTime.xml
Enter fullscreen mode Exit fullscreen mode

Copy xml to this file and press Ctrl-S (Save) and Ctrl-X(Exit).

<?xml version="1.0"?>
<Schema name="OnTime">
<Cube name="OnTime">
  <Table name="ontime"/>
  <Dimension name="Date">
    <Hierarchy hasAll="true" allMemberName="All Dates">
      <Level name="Year" column="Year" uniqueMembers="true" type="Numeric"/>
      <Level name="Quarter" column="Quarter" uniqueMembers="true" type="Numeric">
        <CaptionExpression>
          <SQL dialect="generic">concat(toString(Quarter), ' ', toString(Year))</SQL>
        </CaptionExpression>
      </Level>
      <Level name="Month" column="Month" uniqueMembers="true" type="Numeric">
        <CaptionExpression>
<SQL dialect="generic">
concat(          
CASE Month WHEN 1 THEN 'January'
WHEN 2 THEN 'February'
WHEN 3 THEN 'March'
WHEN 4 THEN 'April'
WHEN 5 THEN 'May'
WHEN 6 THEN 'June'
WHEN 7 THEN 'July'
WHEN 8 THEN 'August'
WHEN 9 THEN 'September'
WHEN 10 THEN 'October'
WHEN 11 THEN 'November'
WHEN 12 THEN 'December'
ELSE 'Unknown' END,
' ',
toString(Year))
</SQL>
        </CaptionExpression> 
      </Level> 
      <Level name="Day" column="DayofMonth" uniqueMembers="true">
        <CaptionExpression>
          <SQL dialect="generic">concat(toString(Year), '-', toString(Month), '-', toString(DayofMonth))</SQL>
        </CaptionExpression>
      </Level>
    </Hierarchy>
  </Dimension>
  <Dimension name="Unique Carrier">
    <Hierarchy hasAll="true" allMemberName="All Unique Carriers">
      <Level name="Unique Carrier" uniqueMembers="true">
        <KeyExpression>
          <SQL dialect="generic">toStringCutToZero(UniqueCarrier)</SQL>
        </KeyExpression>
      </Level>
    </Hierarchy>
  </Dimension>
  <Dimension name="Destination">
    <Hierarchy hasAll="true" allMemberName="All Destinations">
      <Level name="Destination State Name" column="DestStateName" uniqueMembers="true"/>
      <Level name="Destination City" column="DestCityName" uniqueMembers="true"/>
    </Hierarchy>
  </Dimension>
  <Dimension name="Origin">
    <Hierarchy hasAll="true" allMemberName="All Origins">
      <Level name="Origin State Name" column="OriginStateName" uniqueMembers="true"/>
      <Level name="Origin City" column="OriginCityName" uniqueMembers="true"/>
    </Hierarchy>
  </Dimension>
  <Measure name="Flights" column="Flights" aggregator="sum" formatString="#,###"/>
  <Measure name="Air Time" column="AirTime" aggregator="sum" formatString="#,###"/>
  <Measure name="Distance" column="Distance" aggregator="sum" formatString="#,###"/>
  <Measure name="Dep Delay" column="DepDelay" aggregator="sum" formatString="#,###"/>
  <Measure name="Carrier Delay" column="CarrierDelay" aggregator="sum" formatString="#,###"/>
  <Measure name="Weather Delay" column="WeatherDelay" aggregator="sum" formatString="#,###"/>
</Cube>
<Cube name="ViewOnTime">
<View alias="ViewOnTime">
<SQL dialect="generic">
<![CDATA[select * from ontime]]>
</SQL>
</View>
  <Dimension name="Destination City Name" foreignKey="DestCityName">
    <Hierarchy hasAll="true" allMemberName="All Destinations">
      <Level name="Destination City Name" column="DestCityName" uniqueMembers="true"/>
    </Hierarchy>
  </Dimension>
  <Dimension name="Origin City Name" foreignKey="OriginCityName">
    <Hierarchy hasAll="true" allMemberName="All Origin Cities">
      <Level name="Origin City Name" column="OriginCityName" uniqueMembers="true"/>
    </Hierarchy>
  </Dimension>
  <Measure name="Dep Delay" column="DepDelay" aggregator="sum" formatString="#,###"/>
</Cube>
</Schema>
Enter fullscreen mode Exit fullscreen mode

You have to give rights to this file to the user under which Tomcat server is running. In the following command change the user tomcat:tomcat name to the name under which your Tomcat server works and run it.

sudo chown -RH tomcat:tomcat $TOMCATDIR/webapps/emondrian/WEB-INF/schema/OnTime.xml
Enter fullscreen mode Exit fullscreen mode

Now, we will edit the Mondrian datasource configuration file. Run following command.

sudo nano $TOMCATDIR/webapps/emondrian/WEB-INF/datasources.xml
Enter fullscreen mode Exit fullscreen mode

Replace the DataSources tag with following xml.

<DataSources>
  <DataSource>
    <DataSourceName>OnTime</DataSourceName>
    <DataSourceDescription>ClickHouse Sample Data</DataSourceDescription>
    <URL>http://localhost:8080/emondrian/xmla</URL>
    <DataSourceInfo>Provider=mondrian;Jdbc=jdbc:clickhouse://192.168.100.8:8123/datasets;JdbcDrivers=ru.yandex.clickhouse.ClickHouseDriver</DataSourceInfo>
    <ProviderName>Mondrian</ProviderName>
    <ProviderType>MDP</ProviderType>
    <AuthenticationMode>Unauthenticated</AuthenticationMode>
    <Catalogs>
        <Catalog name="OnTime">
            <Definition>/WEB-INF/schema/OnTime.xml</Definition>
        </Catalog>
    </Catalogs>
  </DataSource>
</DataSources>
Enter fullscreen mode Exit fullscreen mode

Edit connection string to your ClickHouse dataset in DataSourceInfo tag using following syntax:
jdbc:clickhouse://<host>:<port>[/<database>][?user=<my_user>&password=<my_password>]

Save (Ctrl-S) and close (Ctrl-X) datasources file.
Restart tomcat server.

sudo systemctl restart tomcat
Enter fullscreen mode Exit fullscreen mode

Create a new connection to Mondrian server in Excel to see cubes from your new schema. Select one to browse its data in PivotTable.

Try and create your one Mondrian schema for ClickHouse datasets (documentation).

Discussion (0)

pic
Editor guide