Creating Useful Documentation

The following blog by Jacqui Moore was originally published on Do Mo(o)re With Data on January 26, 2023, and is cross-posted here with permission. Jacqui is a Tableau Social Ambassador and a Senior Data Analytics and Viz Consultant for Cleartelligence.


I’ve written a lot of documentation. It’s a task that few people enjoy, and I am no exception. I’ve also read a lot of documentation and unwound a lot of undocumented reporting, so I know it’s a task that is often overlooked and underappreciated. Good documentation can be invaluable for maintainability, training, and knowledge transfer. I’ve certainly come back to a project six months after working on it and forgotten why something was done a certain way, or dug my way through Tableau workbooks and ETL code to find out where a certain piece of logic was coming from.


I find the most useful documentation in day-to-day work is the documentation that is right where you need it. Making documentation part of your workflow can save your own sanity, and pay dividends in time saved, either your own or whoever inherits your work.

This isn’t to say that a full technical document isn’t helpful or needed. Such documents provide invaluable information on the business context, interactivity, use cases, logic, and more. However, documenting your process right in the tool where you are working will save immense amounts of time and confusion down the line. Documentation kept there is also easier to keep up to date, and you can even use it at the end of a project to make the creation of technical documents and user documents easier.


So, where does this living documentation, well, live? Ideally, it lives everywhere the data is touched. Keep in mind, this type of documentation isn’t meant to be redundant, but to add context that isn’t immediately apparent.


Data Prep Stage

What to Include

  • Authors

  • Date created or modified

  • Designed purpose and limitations of the data source

  • Data lineage and dependencies. With a well-used database, this may matter less than when you are connecting to spreadsheets or processes that need to be updated or maintained.

  • Data freshness timestamp, if applicable

  • Call out any inclusion/exclusion criteria, transformations done, business decisions, or logic explanations

Ensure this makes it to the users of the data! If they won’t open the workflow or see the SQL, then passing this information downstream is key. Some tools, like dbt’s “Exposures”, include features to surface this type of information to others.
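
As a minimal sketch of passing this context downstream, the items listed above can sit in a header comment, and a freshness timestamp can travel with the data as a column. The column name [Data Refreshed At] and the placeholder header values are assumptions for illustration; the table is the Superstore Orders table used in the example later in this post.

```sql
/*
Author: <your name>
Date Created: <YYYY-MM-DD>
Purpose: Order lines for the West region
Limitations: West region only; not suitable for company-wide totals
Lineage: SuperstoreUS.dbo.Orders
*/
SELECT
	 o.[Order ID]
	,o.[Order Date]
	,o.Sales as [Amount Sold]
	,GETDATE() as [Data Refreshed At]  -- time this query ran; assumes the query runs as part of each refresh
FROM
	SuperstoreUS.dbo.Orders o
WHERE
	o.Region = 'West'  -- inclusion criterion: West region only
```

Because the timestamp is a regular column, anyone who connects to the output can see how fresh the data is without opening the SQL.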

Ways to Document


Commenting and annotating code and workflows is helpful to quickly orient yourself or others on what is happening, where, and why. Naming conventions for fields, subqueries, views, etc. will also go a long way.


Using a very simple example based on the Superstore data set, I’ve shown some ways to document data preparation below. Most of the queries and workflows we create will be much more complex than this, so documentation becomes more important. For this example, I used Ken Flerlage’s SQL Server Superstore instance. If you need a server to connect to for learning how to use data prep tools, check out his post on FlerlageTwins.com!


How this might look in SQL:

  • Comments should clarify any changes, as well as any logic that may seem unneeded or whose purpose is unclear

  • Aliases should be easy to identify

  • Formatting is clean, so the query is easy to read and key information is easy to locate

  • All fields are prefixed with the source table alias

/* 
Author: Jacqui Moore
Date Created: 2023-01-19
Purpose: All Orders and returns for West Region
Modified: 2023-01-20 Ticket ABC-123
*/

SELECT 
	 o.[Order ID]
	,o.[Order Date]
	,o.[Ship Date]
	,o.[Customer ID]
	,o.[Customer Name]
	,o.Segment
	,o.[Product ID]
	,o.[Product Name]
	,o.[Category]
	,o.[Sub-Category]
	,o.Sales as [Amount Sold]
	,o.Quantity as [Quantity Sold]
--	,o.Discount as [Discount as Sold]  --Removed per ABC-123
--	,o.Profit as [Profit as Sold] -- Removed per ABC-123
	,r.Returned
FROM 
	SuperstoreUS.dbo.Orders o
-- LEFT JOIN keeps orders that were never returned (Returned is NULL for those)
LEFT JOIN 
	SuperstoreUS.dbo.[Returns] r
	ON r.[Order ID] = o.[Order ID]
WHERE 
	o.Region = 'West'
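
Naming conventions matter most once a query has intermediate steps. As a hedged sketch, a trimmed version of the query above could be written with named subqueries (common table expressions). The names west_orders and returned_orders are illustrative assumptions, not part of the original query.

```sql
WITH west_orders AS (
	-- Orders limited to the West region
	SELECT
		 o.[Order ID]
		,o.[Order Date]
		,o.Sales as [Amount Sold]
	FROM
		SuperstoreUS.dbo.Orders o
	WHERE
		o.Region = 'West'
),
returned_orders AS (
	-- Return flag per order, taken as-is from the Returns table
	SELECT
		 r.[Order ID]
		,r.Returned
	FROM
		SuperstoreUS.dbo.[Returns] r
)
SELECT
	 wo.[Order ID]
	,wo.[Order Date]
	,wo.[Amount Sold]
	,ro.Returned
FROM
	west_orders wo
-- LEFT JOIN keeps West orders that have no matching return
LEFT JOIN
	returned_orders ro
	ON ro.[Order ID] = wo.[Order ID]
```

Each block's name says what it holds, so a reader can follow the final SELECT without reading every subquery first.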

How this might look in Alteryx:

  • Comment header to indicate name, purpose, creator, and important information about a workflow

  • Containers can be used to create a “Read Me” for additional information

  • Tools are annotated descriptively

  • Calculated fields are commented with assumptions, or additional context the next person needs to know

  • Containers are used to segment the steps and provide additional context on the processing of the data

How this might look in Tableau Prep:

  • While there are fewer ways to add notes with Tableau Prep, you can add a description to each step

  • Groups can be used to create a cleaner flow, with the ability to drill in on steps, and they act similarly to Alteryx containers in some ways

  • Calculations can be commented using // at the start of a comment line

Other helpful things to include

  • If you’ve used a macro, tool group, or snippet of code from somewhere else, include a link to the original source

  • If you’ve used a macro or tool group, include a brief description of the purpose and what operations are being performed

Visualization Stage

On the Data Source

  • Give your Tableau Data Source a descriptive name

  • Pre-filter any data in the data source, whenever possible

  • Rename the tables, if the names aren’t clear

  • If you are using Custom SQL, comment that code

  • If the data source is published, add a description containing some of the high-level information from the data source section; it is helpful context for users who might later try to connect to the data
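
For the Custom SQL point above, here is a minimal sketch of a commented, pre-filtered query as it might be pasted into a Custom SQL connection. It reuses the Superstore Orders table from the data prep example; the comment wording is an assumption for illustration. Block comments are used on the assumption that they are less likely than trailing `--` comments to cause problems if the tool wraps the query in a subquery.

```sql
/* Purpose: order lines for the West region dashboard */
/* Pre-filtered here so the workbook only loads the rows it needs */
SELECT
	 o.[Order ID]
	,o.[Order Date]
	,o.[Category]
	,o.Sales as [Amount Sold]  /* renamed to match the business term */
FROM
	SuperstoreUS.dbo.Orders o
WHERE
	o.Region = 'West'  /* inclusion criterion: West region only */
```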

On the Data Pane

  • Rename fields to use ‘friendly’ names, such as the common nomenclature for the field among the business users

  • Set the right data types

  • Add a comment to fields if your data source will be used for Ask Data or by business users who are less familiar with the data and/or Tableau. The comment will appear on hover in the data pane and in Ask Data on Server

  • If the Table names are enough context to group the fields, then that is fine, but if the data source has a lot of fields, using Folders can be useful


  • Having a naming convention that makes it clear when LODs or Parameters are being used can be very helpful, but it can also make field names less friendly when they are displayed on views

Did you know the field descriptions are searchable? Yep, you can come up with a tag system and include it in the description, and search right in the Tableau Desktop data pane. Field descriptions are also visible on Ask Data.

  • In addition to the items above, calculations can be commented in the calculation window, just like any other type of code

  • When you’re ready to publish, clean up…

    • Delete calculations you ended up not using, copies of fields, etc.

    • Hide all unused fields

    • Hide fields that aren’t meant to be used (such as ID fields that you need but that don’t mean anything to the user, or base fields that were replaced with LOD calculations). If they can’t be hidden, putting them in a folder labeled “Do Not Use” is also helpful. If it will mess up someone’s analysis to use that field, hide it.

Sheets

  • Name the sheets descriptively, with leading names that help identify the section, dashboard, etc.

  • Color coding your tabs can be very helpful. People use the colors for different things, but I like to use them to show when certain filters will apply

  • I like to use the colors to show filters because, when you change filter settings to apply to specific sheets, you can see these colors, which makes it much easier to select the right sheets for the right filters

  • If you aren’t using the captions for display on a dashboard, you can use those to add notes on how a more complicated sheet is working

  • You can include a sheet with a “Read Me” for developers, containing data source or workbook level information. This sheet doesn’t get published, but can contain a wealth of knowledge

Dashboards

  • Layout containers are awesome. Use layout containers! But, really, containers help a lot with development and layout, save you from floating many items, and keep the dashboard organized

  • Name the containers so you can identify them in the layout pane. This has saved me on complicated dashboards, and is definitely worth the time it takes to do it.

  • Include clear chart headings, axis headings, and helper text so the user knows what they are looking at and has answers to any logic questions

  • When you’re done, “Hide all sheets” will clean up your workbook. Delete any unused sheets that you don’t need to keep for a reason.

For The End User


So far, the types of documentation I’ve covered are for developers (or yourself). But, whether you create functional user documentation or not, having documentation baked into the dashboard will be appreciated by the end user. For some users, it’s the only type of documentation they ever even see.

  • Tooltips can contain descriptions of what the metrics mean, text indicating what actions are available, and more. Don’t neglect tooltips!

  • Titles, labels and helper text are types of text that are displayed directly on the dashboard, and are important. These are things like clear axis labels, text describing interactivity, color legends, descriptive titles, and so on.


  • Overlays can be helpful for complicated dashboards with a lot of interactivity, where the visible helper text would be redundant, or just too much.

  • Include in the header or footer of the dashboard things like:

    • Data refresh date

    • Date range included, if different from the refresh date

    • Business points of contact

    • Developer point of contact

And now that I have thoroughly talked about one of the most tedious parts of development, go forth and do good data!
