Stabilize my data types, please!
I work on a reporting system that uses Tableau for the user-facing side, and has some fun Python pandas to bring data from multiple systems together in a way that makes the reports useful. One issue I was running into is that each time I updated the csv files that are pulled into Tableau, some fields would get switched to a different data type. Then the calculations based on those columns would break because of the data type change. So much fun.
I searched online for solutions to the problem, and found that people use a schema.ini file to identify the columns of a file and their associated data types. Ok, cool, thanks InterWebs!
Wait, how do I create one? How do I upload it to Tableau? Where should it be saved? How do I make more than one because I have multiple csvs to upload? This is where my search fell short - I couldn't find a clear source for how the heck to do the whole process when the goal is a Tableau workbook.
After I had my schema.ini file up and running, I realized that the Tableau community forum has a good bit of info on it -- look there if you have any questions not covered here. I will be starting there, instead of Google, in the future!
What is schema.ini? Should I use it?
It turns out that this file is used by some database types, like Microsoft JET, to identify the columns in a file as well as the data types. Microsoft JET is what Tableau uses to import your data from a text file, so it will uses the schema.ini file if it is exists. If it doesn't exist, JET will guess at your data types.
You can even use schema.ini for fixed-width files by adding a Width indicator. Fixed-width files are such a pain, but some of the systems I work with are set up to only output these. Knowing that I could use them with Tableau is a super win for me!
If you want to be sure that your text file, whether a csv, tsv, or fixed-width, imports the same way each time to Tableau, you probably want it.
How do I create one?
Luckily, this part was pretty easy, as there are good examples on the internet. I started with a short example from a Tableau forum post
and there is more info directly from Microsoft
You can find all of the possible data types you might use on the JET wiki
Where do I save it? How do I upload it?
Save it in the same folder as your text file(s). JET will auto-magically find it, so you don't need to (and can't!) upload it to Tableau.
What do I do about multiple files in the same folder?
Since it's just one schema.ini file per folder, you will put the info for all of your text files in the same folder. The order doesn't seem to matter, but put the file name, followed by its column information. You cannot define the columns for multiple files together (i.e. no "*.txt").
Other tips
- This file has a strong effect! If you forget a quotation mark around a column name, you will end up with a field name that you didn't mean, even if that's not how it is in the csv file. It's no big deal, just go back and change it, and then refresh the data (at least in Tableau 10). This also means that you could change a column name in the schema file without changing it in the text file, whether you intend to or not.
- If you forget a column, it will not be pulled in. At first I accidentally left off the final row and it didn't appear in Tableau at all.
- If you have dates, you can specify their formats, too. The DateTime format is "yyyy-mm-dd hh:nn:ss" - note that it's all lowercase. You can, of course, specify a different format, such as "mm-dd-yyyy" if that's what you happen to have.
Top comments (0)