Tableau Data Prep

Note: Data source owners and Tableau administrators can add synonyms for specific data field names and values for Ask Data. For information about using data roles for Ask Data, see Add Synonyms for Ask Data in the Tableau Desktop help.

Working in Tableau Prep: open Tableau Prep and use two Input steps to bring in the Timesheet and Calendar Scaffold data sources. Next, use a Join step to join the Timesheet data source to the Calendar Scaffold on Date. This join brings the Pay Period numbers into the Timesheet data. To do that, open Tableau Prep and connect to the Excel file. Once the file is connected, select Cleaned with Data Interpreter. Tableau Prep then suggests two tables based on an automated analysis of the data in the Excel file. Select the second table, at the bottom.
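To make the effect of that Join step concrete, here is a minimal Python sketch of an inner join on Date. The sample rows, the field names other than Date and Pay Period, and the helper `inner_join_on_date` are hypothetical, and the code only illustrates the idea; it is not how Tableau Prep implements joins.

```python
from datetime import date

# Hypothetical sample rows; only "Date" and "Pay Period" come from the text above.
timesheet = [
    {"Employee": "Ana", "Date": date(2021, 1, 4), "Hours": 8},
    {"Employee": "Ana", "Date": date(2021, 1, 18), "Hours": 6},
    {"Employee": "Raj", "Date": date(2021, 1, 5), "Hours": 7},
]
calendar_scaffold = [
    {"Date": date(2021, 1, 4), "Pay Period": 1},
    {"Date": date(2021, 1, 5), "Pay Period": 1},
    {"Date": date(2021, 1, 18), "Pay Period": 2},
]


def inner_join_on_date(left, right):
    """Keep each left row that has a matching Date and add the right row's fields."""
    by_date = {row["Date"]: row for row in right}
    joined = []
    for row in left:
        match = by_date.get(row["Date"])
        if match is not None:
            joined.append({**row, **match})
    return joined


joined_rows = inner_join_on_date(timesheet, calendar_scaffold)
for row in joined_rows:
    print(row["Employee"], row["Date"], row["Hours"], row["Pay Period"])
```

Each Timesheet row now carries the Pay Period of its date, which is what the join on Date is meant to achieve.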

Tableau Prep Builder is a tool in the Tableau product suite designed to make preparing your data easy and intuitive. Use Tableau Prep Builder to combine, shape, and clean your data for analysis in Tableau. Note: In version 2019.1.2, Tableau Prep was renamed Tableau Prep Builder, which refers to the desktop application. For example, Tableau Prep lets you combine an Oracle table, a SQL Server table, and a Microsoft Excel worksheet into one data source with just a couple of clicks. While some data prep can be done on the Data Source tab in Tableau Desktop, there are limits to what it can do.

Use data roles to quickly identify whether the values in a field are valid or not. Tableau Prep delivers a standard set of data roles that you can select from or you can create your own using the unique field values in your data set.

When you assign a data role, Tableau Prep compares the standard values defined for the data role with the values in your field. Any values that don't match are marked with a red exclamation mark. You can filter your field to view only the valid or invalid values and take the appropriate actions to fix them. Once you've assigned a data role to your fields, you can use the Group Values option to group and match invalid values to valid ones based on spelling and pronunciation.

Note: Starting in version 2020.4.1, you can now create and edit flows in Tableau Server and Tableau Online. The content in this topic applies to all platforms, unless specifically noted. For more information about authoring flows on the web, see Tableau Prep on the Web.

Assign standard data roles to your data

Assign data roles provided by Tableau Prep to your field the same way you assign a data type. The data role identifies what your data values represent so Tableau Prep can automatically validate values and highlight ones that aren't valid for that role.

For example, if you have field values for geographical data, you can assign a data role of City. Tableau Prep then compares the values in the field to a set of known domain values to identify values that don't match.

Note: Each field is analyzed independently so a City value of 'Portland' in State 'Washington' in Country 'USA' might not be a valid city and state combination, but it won't be identified that way because it is a valid city name.
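The independence described in this note can be illustrated with a small Python sketch. The domain sets `KNOWN_CITIES` and `KNOWN_STATES` are tiny hypothetical stand-ins for the real geographic data, and the case-insensitive comparison in `is_valid` is an assumption made for the example, not documented Tableau Prep behavior.

```python
# Hypothetical stand-ins for the known domain values of two data roles.
KNOWN_CITIES = {"portland", "seattle", "spokane"}
KNOWN_STATES = {"oregon", "washington"}


def is_valid(value, domain):
    """Check one value against one role's domain; other fields are not consulted."""
    return value.strip().lower() in domain


row = {"City": "Portland", "State": "Washington", "Country": "USA"}

# Each field is checked on its own, so this mismatched pair still passes.
city_ok = is_valid(row["City"], KNOWN_CITIES)
state_ok = is_valid(row["State"], KNOWN_STATES)

# A misspelled value fails because it is not in the City domain.
typo_ok = is_valid("Portlnd", KNOWN_CITIES)

print(city_ok, state_ok, typo_ok)
```

Catching an invalid city and state combination would need a check across fields, which per-field validation of this kind does not perform.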

Tableau Prep Builder provides the following data roles:

  • Email

  • URL

  • Geographic roles (Based on current geographic data and is the same data used by Tableau Desktop)

    • Airport
    • Area code (U.S.)
    • CBSA/MSA
    • City
    • Congressional District (U.S.)
    • Country/Region
    • County
    • NUTS Europe
    • State/Province
    • Zip code/Postal code

Tip: In Tableau Prep Builder version 2019.1.4 and later and on the web, if you assign a geographic role to a field, you can also use that data role to match and group values with the standard value defined by your data role. For more information about grouping values using data roles, see Clean and Shape Data.

To assign a data role to a field, do the following:

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select the data role for the field.

    Tableau Prep compares the field's data values to known domain values or patterns (for email or URL) for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and from the Show Values section select an option to show all values or only values that are valid or not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.
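For pattern-based roles such as Email, the validation and the Show Values filter in the steps above can be pictured roughly as follows. This is a sketch only: `EMAIL_PATTERN` is a deliberately simple regular expression chosen for illustration, and `split_by_validity` is a hypothetical helper; the pattern Tableau Prep actually uses is not stated in this topic.

```python
import re

# Assumed, simplified pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def split_by_validity(values, pattern=EMAIL_PATTERN):
    """Return (valid, invalid) lists, similar to filtering with Show Values."""
    valid, invalid = [], []
    for value in values:
        if pattern.fullmatch(value):
            valid.append(value)
        else:
            invalid.append(value)
    return valid, invalid


emails = ["ana@example.com", "raj(at)example.com", "li@example", "kim@example.org"]
valid_emails, invalid_emails = split_by_validity(emails)

print("Valid:", valid_emails)
print("Not valid:", invalid_emails)
```

The values in the second list correspond to the ones that would be flagged and then corrected with the cleaning options.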

Create custom data roles

Starting in Tableau Prep Builder version 2019.3.1 and on the web, you can create your own custom data roles using the field values in your data sets to create a standard set of values that you or others can then use to validate fields when cleaning data. Select the field that you want to use, apply any cleaning operations to it if needed, then, publish it to Tableau Server or Tableau Online to use it in your flow or share your data roles with others.

If creating custom data roles when editing flows on the web, you can publish the custom data role directly to the server you are signed into.

Requirements

  • You can create custom data roles from single fields in your data set. Creating custom data roles from a combination of fields isn't supported.
  • You can create custom data roles only for fields assigned to a data type of String and Number (whole).
  • When you create a custom data role, Tableau Prep creates an output step in your flow that is specific to publishing the data role.
  • Publishing custom data roles to multiple sites in the same flow isn't supported. If you publish the flow, you must publish the custom data role to the same site or server where the flow is published.
  • Custom data roles are specific to the site, server and project where you publish them. All users with permissions to the location can use the custom data role, but must be signed into the site or server to select it or apply it. Custom data roles are assigned the default permission for the All Users group for new projects instead of None.
  • Custom data roles aren't version specific. When applying a custom data role, the most current version is applied.
  • Once a custom data role is published to Tableau Server or Tableau Online, users with access to the site, server, and project can view all data roles in that location.
    • Users with appropriate permissions can move, delete or edit permissions for the data roles.
    • The permissions you can set and actions you can take on a custom data role are similar to what you can do with a flow. For more information, see Manage a Flow. For more information on setting permissions, see Permission capabilities in the Tableau Server help.
  • To edit a data role, you must make your changes in Tableau Prep Builder or in the flow on the web, then republish the data role using the same name to overwrite it. This process is similar to editing a published data source.

Create a custom data role

  1. In the Profile pane, data grid, or Results pane select the field you want to use to create a custom data role.

  2. Click More options for the field, and select Publish as Data Role.

  3. Select the server and project where you want to publish the data role.

  4. Click Run Flow to create the data role. After the publishing process completes successfully, you can view your data role in Tableau Server or Tableau Online. Processing the data role can take some time based on the load on your Tableau Server or Tableau Online site. If your data role isn't available right away, wait a few minutes, then try selecting it again.

Apply a custom data role

  1. In the Profile pane, Results pane or data grid, click the data type for the field where you want to apply the custom data role.

  2. Select Custom then select the data role that you want to apply to the field.

    Important: In Tableau Prep Builder, make sure you are signed into the site or server where the data role was published or you won't see this option.

    Tableau Prep compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click the drop-down arrow for the field and from the Show Values section select an option to show all values or only values that are valid or not valid for the data role.

  4. Use the cleaning options on the More options menu for the field to correct any values that aren't valid. For more information about how to clean your field values, see About cleaning operations.

View and manage custom data roles

You can view and manage your published custom data roles on Tableau Server and Tableau Online. You can view all custom data roles published to your site or server. Click More actions for a selected data role to move it to a different project, change permissions or delete it.

Group similar values by data role

Note: In Tableau Prep Builder version 2019.1.4 and 2019.2.1 this option was labeled Data Role Matches.

If you assign a geographic data role to a field, you can use the values in the data role to group and match values in your data field based on spelling and pronunciation to standardize them. You can use either Spelling or Pronunciation + Spelling to group and match invalid values to valid ones.

These options use the standard value defined by the data role. If the standard value isn't in your data set sample, Tableau Prep adds it automatically and marks the value as not in the original data set. For more information about assigning data roles to fields, see Assign standard data roles to your data.

To use data roles to group values, complete the following steps.

  1. In the Profile pane, Results pane or data grid, click the data type for the field.

  2. Select one of the following data roles for the field:

    • Airport
    • City
    • Country/Region
    • County
    • State/Province

    Starting in Tableau Prep Builder version 2019.3.2 and on the web, you can also select from your custom data roles.

    Tableau Prep compares the field's data values to known domain values for the data role you select and marks any values that don't match with a red exclamation point.

  3. Click More options, select Group Values (Group and Replace in previous versions), then select one of the following options:

    • Spelling: Matches invalid values to the closest valid values that differ by adding, removing, or substituting characters.
    • Pronunciation + Spelling: Matches invalid values to the most similar valid value based on spelling and pronunciation.

    You can also click the Recommendations icon on the field to apply the recommendation to group and replace the invalid values with valid ones. This option uses the Pronunciation + Spelling Group Values option.

    Tableau Prep compares the values by spelling or spelling and pronunciation and then groups similar values under the standardized value for the data role. If the standardized value isn't in your data set, the value is added and marked with a red dot.
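The Spelling option is described above as matching values that differ by adding, removing, or substituting characters, which is the idea behind Levenshtein edit distance. The Python sketch below groups invalid values under the closest standard value on that basis. The threshold `max_distance`, the sample values, and the function names are assumptions for illustration; the algorithm Tableau Prep actually uses is not described in this topic, and the pronunciation part is not modeled here.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def group_by_spelling(values, standard_values, max_distance=2):
    """Group each value under its closest standard value, if close enough."""
    groups = {}
    for value in values:
        if value in standard_values:
            target = value
        else:
            best = min(standard_values,
                       key=lambda s: edit_distance(value.lower(), s.lower()))
            close = edit_distance(value.lower(), best.lower()) <= max_distance
            target = best if close else value  # leave distant values ungrouped
        groups.setdefault(target, []).append(value)
    return groups


# Hypothetical standard values for a City role and a field with typos.
standard_cities = ["Seattle", "Portland", "Spokane"]
field_values = ["Seattle", "Seatle", "Portlnd", "Sattle", "Xyz"]

groups = group_by_spelling(field_values, standard_cities)
# Standard values used as group names that never appeared in the data.
added_standards = [name for name in groups
                   if name in standard_cities and name not in field_values]

print(groups)
print("Not in original data set:", added_standards)
```

In this sketch "Portland" plays the role of a standardized value that is added because it does not occur in the data, while "Xyz" stays on its own because no standard value is within the assumed threshold.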

Tableau Prep Builder is a tool in the Tableau product suite designed to make preparing your data easy and intuitive. Use Tableau Prep Builder to combine, shape, and clean your data for analysis in Tableau.

Note: In version 2019.1.2, Tableau Prep was renamed Tableau Prep Builder, which refers to the desktop application. Starting in version 2020.4.1, you can also create and edit flows on the web. Tableau Prep on the web refers to creating or editing flows on Tableau Server or Tableau Online. For more information, see Tableau Prep on the Web.

Using Tableau Prep

Start by connecting to your data from a variety of files, servers, or Tableau extracts. Connect to and combine data from multiple data sources. Drag and drop or double-click to bring your tables into the flow pane, and then add flow steps where you can then use familiar operations such as filter, split, rename, pivot, join, union and more to clean and shape your data.

Each step in the process is represented visually in a flow chart that you create and control. Tableau Prep tracks each operation so that you can check your work and make changes at any point in the flow.

When you are finished with your flow, run it to apply the operations to the entire data set.
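One way to picture a flow is as an ordered list of operations that is previewed on a sample while you build it and then applied to every row when you run it. The Python sketch below models only that idea. The step functions, the field names, and `run_flow` are hypothetical and do not correspond to any Tableau Prep API.

```python
def remove_null_regions(rows):
    """Filter step: drop rows where Region is missing."""
    return [row for row in rows if row["Region"] is not None]


def uppercase_region(rows):
    """Cleaning step: standardize Region to upper case."""
    return [{**row, "Region": row["Region"].upper()} for row in rows]


# The flow is the ordered list of steps; order matters, because
# uppercase_region would fail on a missing Region if it ran first.
flow_steps = [remove_null_regions, uppercase_region]


def run_flow(rows, steps):
    """Apply each step in order and return the resulting rows."""
    for step in steps:
        rows = step(rows)
    return rows


full_data = [
    {"Region": "east", "Sales": 10},
    {"Region": None, "Sales": 5},
    {"Region": "west", "Sales": 7},
    {"Region": "east", "Sales": 3},
]
sample = full_data[:2]  # assumed sample used while building the flow

preview = run_flow(sample, flow_steps)    # what you see while authoring
output = run_flow(full_data, flow_steps)  # what running the flow produces

print(preview)
print(output)
```

Because the steps are kept as an explicit list, any of them can be inspected, changed, or reordered before the flow is run on the full data set.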

Tableau Prep works seamlessly with other Tableau products. At any point in your flow, you can create an extract of your data, publish your data source to Tableau Server or Tableau Online, publish your flow to Tableau Server or Tableau Online to continue editing on the web or refresh your data using a schedule. You can also open Tableau Desktop directly from within Tableau Prep Builder to preview your data.

For information about installing Tableau Prep Builder, see Install Tableau Desktop or Tableau Prep Builder in the Tableau Desktop and Tableau Prep Deployment Guide.

See Tableau Prep Builder in action

Ready to try it out? From the Start page, click one of the sample flows to explore and experiment with the steps, try the Get Started with Tableau Prep Builder hands-on tutorial to learn how to create a flow, or step through one of the Day in the Life Scenarios using Tableau Prep Builder.

Note: You can find the sample data files used in the flows in these locations:

  • (Windows) C:\Program Files\Tableau\Tableau Prep Builder <version>\help\Samples\en_US
  • (Mac) /Applications/Tableau Prep Builder <version>.app/Contents/help/Samples/en_US

To learn more about how Tableau Prep Builder optimizes your data for performance, see Tableau Prep under the hood. To learn more about Tableau Prep and the different features and functions it offers, review the topics in this guide.

A tour of the Tableau Prep workspace

The Tableau Prep workspace consists of the Connections pane (A) where you connect to your data sources, and three coordinated areas that help you interact with and explore your data:

  • Flow pane (B): A visual representation of your operation steps as you prepare your data. This is where you add steps to build your flow.

  • Profile pane (C): A summary of each field in your data sample. See the shape of your data and quickly find outliers and null values.

  • Data grid (D): The row level detail for your data.

After you connect to your data and begin building your flow, you add steps in the Flow pane. These steps function as a lens into the structure of your data, as well as a summary of the operations that are applied to your data. Each step represents a different category of operations that you define, all as part of your flow.

See the Visual Dictionary for a look at the steps and icons used in Tableau Prep.

Connections pane

On the left side of the workspace is the Connections pane, which shows the databases and files you are connected to. Add connections to one or more data sources and then drag the tables you want to work with into the Flow pane. For more information, see Connect to Data.

You can minimize the Connections pane if you need more room in your workspace.

Flow pane

At the top of the workspace is the Flow pane. This is where you'll build your flow. As you connect to, clean, shape, and combine your data, steps appear in the Flow pane and align from left to right along the top. These steps tell you what kind of operation is being applied, in what order, and how your data is affected by it. For example, the Join step shows you which join type you’ve applied, the join clauses, recommended join clauses, and the fields of the tables that are included in the join.

You start your flow by dragging tables into the Flow pane. Here you can add additional data sets, pivot your data, union or join data, create aggregations, and generate output files in the form of .tds files, Hyper extract (.hyper) files, or published data sources that you can use in Tableau. You can even write your output data to a database. For more information about generating output files, see Save and Share Your Work.

Note: If you make changes to the data while in Tableau Desktop, for example renaming fields, changing data types, and so on, these changes aren’t written back to Tableau Prep Builder.

Profile pane

In the center of the workspace is the Profile pane. The Profile pane shows you the structure of your data at any point in the flow. The structure of your data can be represented in different ways depending on the operation you want to perform on your data or the step that you select in the Flow pane.

At the top of the Profile pane is a toolbar that shows you the cleaning operations that you can perform for each step in your flow. An options menu also appears on each card in the Profile pane where you can select the different operations that you can perform on the data.

For example:

  • Search, sort, and split fields

  • Filter, include, or exclude values

  • Find and fix null values

  • Rename fields

  • Clean up data entry errors using group values or quick cleaning operations

  • Use automatic data parse to change data types

  • Rearrange the order of your field columns by dragging and dropping them where you want them

Select one or more field values in a Profile card and right-click or Ctrl-click (MacOS) to see additional options to keep or exclude values, group selected values or replace values with Null.

Tableau Prep keeps track of any changes you make, in the order you make them, so you can always go back and review or edit those changes if needed. Use drag and drop to re-order the operations in the list to experiment and apply changes in a different order.

Click the arrow on the upper right of the pane to expand and collapse the Changes pane for more room to work with the data in the Profile pane.

For more information about applying cleaning operations to your data, see Clean and Shape Data.

Data grid

At the bottom of the workspace is the Data grid, which shows you the row level detail in your data. The values displayed in the Data grid reflect the operations defined in the Profile pane. You can perform the same cleaning operations here as you can in the Profile pane if you prefer to work at a more detailed level.

Click the Collapse Profiles icon on the toolbar to collapse (and expand) the Profile pane to see your options.

How Tableau Prep stores your data

When you connect Tableau Prep to your data and create a flow, it stores the frequently used data in a .hyper file. For large data sets, this might be a sample of the data. Any stored data is saved under a secure temporary file directory in a file named Prep BuilderXXXXX, where XXXXX represents a universally unique identifier (UUID). After you save the flow, the file is deleted. For more information about how Tableau Prep samples your data, see Set your data sample size.

Tableau Prep Builder also saves data in the Tableau flow (.tfl) file to support the following operations (which can capture entered data values):

  • Custom SQL used in Input steps

  • Filtering (on data entry)

  • Group Values (on data entry)

  • Calculations
