The top menu bar lets you switch between the various views of Oscar and save your storydashboards. This bar also contains the Data Workshop tab, which gives users with the data scientist edition access to the most advanced features.

To switch between editing your storydashboard and presenting it, click on the view/edit tab. In edit mode, a side bar menu gives you access to Oscar's various charts. In view mode, you can interact with and filter your storydashboards to show off your data and insights.

Infographics & Shapes

Oscar allows users to create engaging narratives using our storydashboards. After all, we absorb more information through stories than through charts alone. The infographic features can be used to add context, extra information and colour to the dashboard, providing a more compelling platform for your insights. There are three types of infographic available in Oscar: text, picture and video.

Below are some examples of the kinds of infographics users can deploy in Oscar:

Text

To create a text-based infographic, select ‘Text’ from the side menu bar. Right click on the new graphic that has appeared on your storydashboard and choose edit. The new window offers a host of options to customise your infographic, ranging from fonts, colours and opacity to inserting statistical models.

KPI Functions

With Oscar’s text infographic features you can add basic statistical models to your storydashboard. These include average, sum, min, max, count, percent and standard deviation. Select which dataset and any features you want to use. Choose one of the aforementioned models from the drop down and type in a numerical variable under column. Click insert to produce the statistical function in the previous window where you can edit the colours, fonts and more as with any other piece of text.
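
As a rough illustration of what these functions compute, the sketch below applies them to a small made-up column in Python. The function name kpi and the sample values are assumptions for illustration and are not part of Oscar. ‘Percent’ is left out because its exact definition is not given here, and the population form of the standard deviation is assumed.

```python
import statistics

def kpi(values, function):
    """Return one summary statistic for a list of numbers (illustrative only)."""
    functions = {
        "average": statistics.mean,
        "sum": sum,
        "min": min,
        "max": max,
        "count": len,
        # Assumption: population standard deviation; a sample version is equally plausible.
        "standard deviation": statistics.pstdev,
    }
    return functions[function](values)

sales = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical numerical column
names = ["average", "sum", "min", "max", "count", "standard deviation"]
summary = {name: kpi(sales, name) for name in names}
print(summary)
```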

Picture

To create an image-based infographic, select ‘infographic’ from the side menu bar and click ‘Image’ in the pop up. You can either upload an image from your device or choose one already in Oscar by navigating your drives. Find the image you want and then select insert.

Video

To create a video-based infographic, select ‘infographic’ from the side menu bar and click ‘Video’ in the pop up. You can either upload a video from your device or choose one already in Oscar by navigating your drives. Find the video you want and then select insert.

Filters

Filters can be created, for a specific dataset, to filter out unwanted rows. This filter can then be dynamically applied to charts in the dashboard.

Multiple filters can be created for each dataset; the filter that the chart will use is one of the options when creating the chart and can also be changed by selecting ‘Choose filter’ on the right-click menu available in edit mode.

Figure 1 The right-click menu with the ‘Choose filter’ option highlighted.

Figure 2 The modal to choose a filter for the specific chart (accessed by selecting the ‘Choose filter’ option of the right-click menu).

There are three filter methods available when creating filters: IN, BETWEEN and ISNULL. BETWEEN is for numerical fields; it takes an upper and lower bound, can be inclusive or exclusive of these bounds and can be negated (i.e. NOT BETWEEN). IN is for categorical and discrete numeric fields; it takes field instances and can be inclusive or exclusive (i.e. NOT IN). By default, it is set to exclusive and will filter out rows where the field instance is equal to any of the defined instances. Finally, ISNULL can be used to remove rows with null values for a particular field. Each of these methods defines a node within the overall filter, and these nodes are connected using AND and/or OR nodes.
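
The way these nodes combine can be sketched in Python as small functions that each decide whether a row is kept. This is only an illustration of the logic, not Oscar's implementation. The function names, the sample rows and the choice that a null value never matches a BETWEEN node are all assumptions.

```python
def between(field, lower, upper, inclusive=True, negate=False):
    """BETWEEN node for numerical fields; negate=True gives NOT BETWEEN."""
    def test(row):
        value = row[field]
        if value is None:  # assumption: nulls never match a BETWEEN node
            return False
        inside = lower <= value <= upper if inclusive else lower < value < upper
        return not inside if negate else inside
    return test

def is_in(field, instances, negate=False):
    """IN node for categorical or discrete fields; negate=True gives NOT IN."""
    def test(row):
        found = row[field] in instances
        return not found if negate else found
    return test

def not_null(field):
    """ISNULL node used to drop rows whose field is null (None)."""
    return lambda row: row[field] is not None

def all_of(*nodes):
    """AND node: keep the row only if every child node keeps it."""
    return lambda row: all(node(row) for node in nodes)

def any_of(*nodes):
    """OR node: keep the row if at least one child node keeps it."""
    return lambda row: any(node(row) for node in nodes)

rows = [  # hypothetical dataset
    {"city": "Leeds", "age": 34},
    {"city": "York", "age": 17},
    {"city": "Leeds", "age": None},
    {"city": "Bath", "age": 52},
]

keep = all_of(
    not_null("age"),
    between("age", 18, 60),
    is_in("city", {"Bath"}, negate=True),
)
filtered = [row for row in rows if keep(row)]
print(filtered)
```

Only the first row survives here: the second fails the BETWEEN node, the third has a null age and the fourth is excluded by NOT IN.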

There are two ways that a filter can be created and edited: using the charts in view mode and using the filter GUI available in edit mode.

Colour workshop

Colour workshop is used to manage the colours used in the charts. Selecting the ‘Colours’ button on the left-hand side of the dashboard in edit mode opens the colour workshop, which displays a list of all the datasets loaded into the dashboard as well as the charts on the dashboard, arranged by the dataset that they are using.

The default palette can be chosen for each dataset; this is the palette that will be applied to any newly created charts using that dataset.

Each chart entry contains the chart type, the field that is being used to determine the colour and the colour palette. On most charts, it is possible to change the field used for colouring on this page, and selecting the colour palette makes it possible to modify the colours used by the chart. Some charts, however, do not have any colouring options available (e.g. regression and classification).

Selecting the colour palette will bring up the ‘Edit Colours’ modal, which shows the current colours being used and allows these to be modified. The options available through this modal depend on whether the field being used to colour is categorical or numerical.

With categorical fields, it is possible to choose each colour individually, creating a custom palette, or to change to a different predefined palette. The top combo box is used to change between predefined palettes. For more control, select the coloured box directly, which will bring up the ‘Colour Picker’ modal. Using this, it is possible to select any of the colours in any of the predefined palettes or to select a custom colour, either by using the custom option or by adjusting the RGB values directly. Using this method will create a ‘Custom’ palette, which is available in the ‘Edit Colours’ modal combo box. Please note that it is only possible to have one custom palette for each field in a dataset: if another chart uses the same dataset and field, creating a custom palette for this chart will change the custom palette for the original chart.

Figure 1 The colour picker modal used to create a custom colour palette.

With numerical fields, the gradient can be modified by selecting either the ‘Edit Gradient’ button or one of the coloured boxes. This brings up a separate modal that allows the gradient to be customised. A two-colour or three-colour gradient can be created using this modal. For both, the colours used to represent the upper and lower bounds can be selected, as can the bounds themselves; by default these are the maximum and minimum values of the field respectively. The number of bins can also be defined using this modal; this is the number of distinct colours that will be used. Figure 3 shows an example 2-colour gradient.

Figure 2 The edit gradient modal being used to create a 2-colour numeric colour gradient.

Figure 3 An example 2-colour gradient with 100 as the upper bound, 0 as the lower and 5 bins. Note that any value above 80 will be represented using the upper bound colour and any value below 20 will be represented using the lower bound colour.
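
The binning in this example can be sketched in Python as follows. This is an illustration under assumptions, not Oscar's code: the bins are taken to be of equal width, values outside the bounds are clamped to the first or last bin, and the colours are linearly interpolated in RGB. The function names and the white-to-blue colours are made up for the example.

```python
def bin_index(value, lower, upper, bins):
    """Index of the equal-width bin holding value; out-of-range values are clamped."""
    width = (upper - lower) / bins
    index = int((value - lower) // width)
    return max(0, min(bins - 1, index))

def gradient(low_rgb, high_rgb, bins):
    """One RGB colour per bin, linearly interpolated from low_rgb to high_rgb."""
    colours = []
    for i in range(bins):
        t = i / (bins - 1) if bins > 1 else 0.0
        colours.append(tuple(round(a + (b - a) * t) for a, b in zip(low_rgb, high_rgb)))
    return colours

# Lower bound 0, upper bound 100 and 5 bins, as in the figure.
palette = gradient((255, 255, 255), (0, 0, 255), bins=5)
for value in (5, 20, 50, 85, 120):
    print(value, bin_index(value, 0, 100, 5), palette[bin_index(value, 0, 100, 5)])
```

With these settings each bin is 20 wide, so values below 20 fall in the first bin and take the lower bound colour, while values above 80 fall in the last bin and take the upper bound colour.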

Selecting the 3-colour option adds a central colour. By default this is placed at the central value of the range (i.e. (upper bound + lower bound) / 2), but this can be changed to a user-defined value.

Figure 4 The edit gradient modal being used to create a 3-colour numeric colour gradient.

Charts

Oscar is code-free so that all users can find the insights that keep the business moving forward. All the charts listed below can be deployed by users with no data science background. Users only need an understanding of what is inside their dataset.

Pie Chart

To create a pie chart in Oscar, first select the pie chart icon from the side menu bar. From here, choose the dataset you want the visualisation to draw from and add any filters. Pie charts use categorical data: begin to type the category you’re interested in and select it from the drop down, then choose whether you want the values as the sum or count. If you choose sum, you will need to choose a numeric field and Oscar will give you a sum of that field for each category. Finish by clicking create pie chart.
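
The difference between the count and sum options can be sketched in Python. The function pie_values and the sample orders are hypothetical and only illustrate the aggregation; they are not Oscar's implementation.

```python
from collections import defaultdict

def pie_values(rows, category, mode="count", numeric_field=None):
    """Return {category value: slice size} using either a count or a sum."""
    totals = defaultdict(int)
    for row in rows:
        if mode == "count":
            totals[row[category]] += 1
        elif mode == "sum":
            totals[row[category]] += row[numeric_field]
        else:
            raise ValueError("mode must be 'count' or 'sum'")
    return dict(totals)

orders = [  # hypothetical dataset
    {"region": "North", "revenue": 120.0},
    {"region": "South", "revenue": 80.0},
    {"region": "North", "revenue": 50.0},
]
by_count = pie_values(orders, "region")
by_sum = pie_values(orders, "region", mode="sum", numeric_field="revenue")
print(by_count)
print(by_sum)
```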

Right click on the pie chart to edit the settings; these include converting it to a doughnut chart by editing the inner radius. You can also view the data in the chart as a spreadsheet.

Bar/Line Chart

To create a bar/line chart in Oscar, first select the bar/line chart icon from the side menu bar. From here, choose the dataset you want the visualisation to draw from and add any filters. Begin to type the variable that interests you in the X-axis box, then select it from the drop down; bar/line charts can use categorical or numerical data. Then do the same for the Y-axis. For the Y-axis you must also choose whether you want the values as the sum, count, average, min or max. Finish by clicking create Bar/Line.

Right click on the chart to edit the settings, such as the labels. You can also view the data in the chart as a spreadsheet.

Slope Bar

To create a Slope bar chart, select the ‘slope bar chart’ icon from the side menu bar. From here, choose the dataset you want the visualisation to draw from and add any filters. Type the variable that interests you in the X-axis box and select it from the drop down; this chart can use categorical or numerical data. Then do the same for the Y-axis.

Right click on the chart to edit the settings, such as the labels. You can also view the data in the chart as a spreadsheet.

Geographic Chart

To create a Geographic chart, select the ‘Geographic chart’ icon from the side menu bar. From here, choose the dataset you want the visualisation to draw from and add any filters. Next choose your target variable, which must be numerical, and whether you want the data displayed as a count, average or sum. The next step is selecting the categorical value you want your chart to be coloured by.

After this you need to set the location fields so that the map can be accurate. These must include latitude and longitude as well as a location field, e.g. city. Finally, select the map you want your data displayed on: US states, India States, Germany States, UK post codes or World countries.

Right click on the chart to edit the settings.

Spreadsheet

To visualise your data in an Excel-like table, select the icon from the side bar and pick your dataset and filters. Click ‘Create Spreadsheet’ to generate your table. Edit the size of your table to show more columns and data, or simply scroll in view mode.

Scatter Plot

A scatterplot is a chart that plots two numeric variables against each other. A third dimension can be added to the plot via colour coding with a categorical variable.

Creating a scatterplot in Oscar is simple – enter a numeric field for the x and y-axis and a categorical variable in order to colour code the data points (the categorical field must be filled). A fourth dimension can be added to the chart by ticking the z-axis box and adding another numeric column - the area of the data point represents that variable’s value.

To edit, enter edit mode and right click to bring up the options.

Chord Chart

The Chord Chart can be run using categorical and numerical values.

To create a Chord chart, click on the “Chord Chart” icon in the side bar menu. In the new window, select which dataset and which filter you want to use from the drop downs. After selecting the dataset and the filter, choose the two variables you want to display.

If a categorical variable is selected, you need to decide whether to normalise the variables (by default they are not normalised) and set the variable limit. Normalising scales the variables and makes the chords the same width; it is not recommended that you normalise both variables. The variable limit changes the maximum number of categories Oscar will display. Oscar will combine the remaining, smallest categories into one. By default, the limit is 30.
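
The effect of the variable limit can be sketched in Python. This is an illustration only: the function name, the ‘Other’ label and the choice to count the combined category towards the limit are assumptions, since the exact behaviour is not specified here.

```python
from collections import Counter

def limit_categories(values, limit=30, other_label="Other"):
    """Merge the least frequent categories so that at most `limit` remain."""
    counts = Counter(values)
    if len(counts) <= limit:
        return list(values)
    # Assumption: the merged category counts towards the limit,
    # so only the (limit - 1) most frequent categories are kept as-is.
    kept = {name for name, _ in counts.most_common(limit - 1)}
    return [value if value in kept else other_label for value in values]

colours = ["red"] * 5 + ["blue"] * 3 + ["green"] * 2 + ["pink", "grey"]
limited = Counter(limit_categories(colours, limit=3))
print(limited)
```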

If the selected variable is numerical, you will need to give Oscar the same information as for a categorical variable, plus bins and slide information. Bins defines the shape of the bins and can be given as an integer or a list. An integer sets the number of bins; by default, this number is 10. A list sets the shape of the bins, and the size of the list is the number of bins. For instance, [1,1,2] on range (0,10) will result in bin edges [0, 2.5, 5, 10]. Slide is a boolean value that is false by default. When it is enabled, Oscar ensures that the combined bins are consecutive rather than simply choosing the bins with the smallest counts, which prevents bins from all over the range being merged.
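
One way to read the bins setting is that a list gives the relative width of each bin. The Python sketch below follows that reading and reproduces the [1,1,2] example; the function name bin_edges is hypothetical and the interpretation is an assumption rather than a description of Oscar's internals.

```python
def bin_edges(bins, low, high):
    """Bin edges on [low, high] from an integer count or a list of relative widths."""
    weights = [1] * bins if isinstance(bins, int) else list(bins)
    total = sum(weights)
    edges = [low]
    running = 0
    for weight in weights:
        running += weight
        edges.append(low + (high - low) * running / total)
    return edges

print(bin_edges([1, 1, 2], 0, 10))  # [0, 2.5, 5.0, 10.0]
print(bin_edges(4, 0, 10))          # four equal-width bins
```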

Spider Chart

The Spider Chart can be used to display the value of a specific aggregation over three or more numeric variables related to a categorical variable. To create a Spider Chart, click on the “Spider Chart” icon in the side bar menu.

First choose which dataset is going to be used and which filter to apply from the drop downs. After selecting the dataset and the filter, choose the categorical variable that will be used for the aggregation and the aggregation type (count, average, sum, maximum or minimum).

The axes of the Spider Chart are the numerical variables that appear as “Inputs fields”. Type the numerical variables inside the input fields; axes can be added and removed by clicking on the plus or minus icon, but there must be at least three to create a spider. Oscar will display the chosen aggregation for each axis in relation to the categorical variable.

Once all this data is selected, click “Create Spider” and the Spider Chart will appear. You can edit the chart by right clicking or by using the spanner icon while the chart is selected in edit mode.

Paragram

The Paragram chart can be used for displaying the data of two or more numerical variables against one categorical variable. Paragrams allow users to create and run filters easily, selecting the desired values of the axes by clicking and dragging across the axes or by inserting the value inside the boxes under each axis.

To begin, click on the paragram icon in the side bar menu. A new window appears for selecting the information needed to run the paragram. First, tell Oscar which dataset is going to be used and which filter should be applied; both can be selected from the two drop down menus. After selecting the dataset and the filter, choose a categorical variable, which will be used for colouring all the samples. The next step is selecting the axes of the paragram. Axes in the paragram are numerical variables, and at least two axes are needed to create a paragram inside Oscar.

The order of the numerical axes can be changed with the up/down arrows, and it is possible to add as many axes as you want by clicking on the plus icon.

Once all this data is selected, click “Create Paragram” and the paragram will appear. You can edit the chart by right clicking or using the spanner icon in edit mode.

Regression

When creating a regression chart the dataset is used to train a linear regression model for the target variable. This is essentially a line of best fit for the target based on the other given fields and can be used in Data Workshop to perform predictions on similar datasets using the Apply Model feature.

To create the regression chart, the first step is to choose a name for the new column; this is used when creating a new dataset from the model in Data Workshop. The next field is the target; this is the field that you wish to create the prediction on. Finally, the following fields are those that will be used to create the model, these should be the fields that the user feels have the most impact on the target variable.

Three linear regression models are created from the data: least squares, lasso and ridge regression. The results of each of these will be displayed and a recommendation will be made as to the most suitable of the three. The user is then able to choose which model to use as the final output. The chart produced shows the level of dependency of each of the chosen fields on the target; this is essentially the coefficient of the regression equation normalised between –1 and 1. Similar to the other charts, once created, right-clicking on the chart will give options to edit the information used to create the chart and the final visualisation.
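
To give a feel for how the three model types differ, the Python sketch below fits each of them to a single input field using standard closed-form solutions on centred data. It is a simplified illustration, not Oscar's implementation: Oscar fits several fields at once, and the penalty value alpha, the exact form of each objective (stated in the docstrings) and the sample data are assumptions.

```python
def centred_sums(xs, ys):
    """Return (Sxx, Sxy) computed about the means of xs and ys."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, sxy

def least_squares_slope(xs, ys):
    """Minimises sum((y - b*x)^2) on centred data."""
    sxx, sxy = centred_sums(xs, ys)
    return sxy / sxx

def ridge_slope(xs, ys, alpha):
    """Minimises sum((y - b*x)^2) + alpha * b^2 on centred data."""
    sxx, sxy = centred_sums(xs, ys)
    return sxy / (sxx + alpha)

def lasso_slope(xs, ys, alpha):
    """Minimises 0.5 * sum((y - b*x)^2) + alpha * |b| on centred data."""
    sxx, sxy = centred_sums(xs, ys)
    shrunk = max(abs(sxy) - alpha, 0.0)  # soft-thresholding
    return (shrunk if sxy >= 0 else -shrunk) / sxx

xs = [1, 2, 3, 4]  # hypothetical input field
ys = [2, 4, 6, 8]  # hypothetical target
print(least_squares_slope(xs, ys))
print(ridge_slope(xs, ys, alpha=5))
print(lasso_slope(xs, ys, alpha=5))
print(lasso_slope(xs, ys, alpha=20))
```

Both penalised models shrink the coefficient towards zero, and with a large enough penalty lasso sets it to exactly zero, which is why the three models can disagree about how strongly a field affects the target.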

Timeseries

A time series is a series of data points recorded at regular intervals, each tied to a timestamp. A time series model aims to explain the underlying patterns in this series; the model is then typically used to make predictions for future time periods.

In order to create the model in Oscar, first select the target variable, then choose the aggregation method; this is used to group the target variable by a particular timescale. Then select the date field and identify whether the data has seasonal fluctuations so that the correct model (ARIMA or SARIMAX) can be chosen. The next field is the timescale; this is used to group the target variable using the aggregation method chosen earlier. If the data is seasonal, periodicity will be the next visible field; this identifies the number of aggregated values, or time periods, that denote a season (e.g. 3 months in a quarter). Finally, the forward and historical horizons must be chosen. The historical horizon is the number of time periods used to train the model, which is then used to predict ahead by the forward horizon.
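
The grouping step can be sketched in Python with a monthly timescale. The forecasting model itself is not shown. The function name, the sample records and the assumption that the historical horizon takes the most recent periods are all illustrative.

```python
from collections import defaultdict
from datetime import date

def aggregate_by_month(records, how="sum"):
    """Group (date, value) pairs by calendar month and aggregate each group."""
    groups = defaultdict(list)
    for day, value in records:
        groups[(day.year, day.month)].append(value)
    aggregate = {
        "sum": sum,
        "count": len,
        "average": lambda values: sum(values) / len(values),
    }[how]
    return [(period, aggregate(values)) for period, values in sorted(groups.items())]

records = [  # hypothetical (date field, target variable) pairs
    (date(2023, 1, 5), 10), (date(2023, 1, 20), 15),
    (date(2023, 2, 3), 7), (date(2023, 3, 14), 12),
    (date(2023, 4, 1), 9),
]
monthly = aggregate_by_month(records)

# Assumption: the historical horizon selects the most recent periods for training.
historical_horizon = 3
training = monthly[-historical_horizon:]
print(monthly)
print(training)
```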

The final chart will also denote the confidence of this prediction by giving a colour coded range. Similar to the other charts, once created, right-clicking on the chart will give options to edit the information used to create the chart and the final visualisation. In addition to the chart a new dataset will be created containing the aggregated data.

Distribution

A distribution chart is a chart that plots two variables against each other. A third dimension can be added to the plot via colour coding with a categorical variable. The variables can be either categorical or numerical. If a numerical variable is used then it is discretised (turned into bins, e.g. 0-10).

When only two numeric variables are included in the plot then it resembles a heatmap. The x and y variables (if numeric) are discretised and each box is coloured a certain shade. The shade shows the number of observations in each bucket.
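
The counting behind such a heatmap can be sketched in Python. Fixed-width bins of 10 are assumed purely for illustration; how Oscar chooses the bin widths is not stated here, and the function names and sample points are hypothetical.

```python
from collections import Counter

def bucket(value, width):
    """Lower edge of the fixed-width bin containing value, e.g. 7 -> 0 for width 10."""
    return int(value // width) * width

def heatmap_counts(points, x_width, y_width):
    """Count the observations that fall in each (x bin, y bin) box."""
    return Counter((bucket(x, x_width), bucket(y, y_width)) for x, y in points)

points = [(3, 12), (7, 18), (14, 11), (25, 31), (8, 15)]  # hypothetical (x, y) values
cells = heatmap_counts(points, x_width=10, y_width=10)
print(cells)
```

The box with the highest count would be drawn in the strongest shade.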

When a 3rd variable (Colour By) is used each box is divided up. The size of the colour by boxes shows the proportion of each category within that box.

In order to create a distribution chart, click on the distribution icon under the Charts tab. Add a variable for X-axis value and Y-axis value. Then either add a 3rd Colour By variable or click create distribution. You can edit the chart by right clicking or using the spanner icon in edit mode.

Clustering

Clustering tries to segment data into a given number of clusters based on the similarity between points. The KMeans algorithm is used for this: KMeans randomly assigns the data points to one of k clusters, then repeatedly reassigns each data point to a new cluster to minimise the within-cluster sum of squared distances.
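
The procedure described above can be sketched in plain Python. This is a minimal illustration of the standard algorithm, not Oscar's implementation. The function names and sample points are hypothetical, and the initial_labels parameter exists only so that the example run is reproducible.

```python
import random

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, initial_labels=None, seed=0, max_iterations=100):
    """Return (labels, centres) after iterating until the assignment is stable."""
    rng = random.Random(seed)
    # Start from a random assignment of points to clusters unless one is supplied.
    if initial_labels is not None:
        labels = list(initial_labels)
    else:
        labels = [rng.randrange(k) for _ in points]
    centres = [None] * k
    for _ in range(max_iterations):
        # Update step: each centre becomes the mean of its current members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centres[c] = centroid(members)
            elif centres[c] is None:
                centres[c] = rng.choice(points)  # cluster empty on the first pass
        # Assignment step: move every point to its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centres[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centres

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centres = kmeans(points, k=2, initial_labels=[0, 1, 0, 1, 0, 1])
print(labels)
print(centres)
```

Because the result depends on the starting assignment, different runs on real data can give different clusters.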

The clustering modal differs from most of those used by the other charts as the information is split into three tabs: data, painting and model. In addition, each of the fields is pre-populated so that the user can create a clustering model without any further input, although this is unlikely to produce meaningful results.

On the data tab, the number of clusters can be chosen, as can the fields that will be analysed by the model; this can only contain numerical fields and is pre-populated with all fields chosen. On the painting tab, five fields can be chosen for which the data will be visible for each point on the chart; this can be seen by hovering over a point while in ‘View’ mode. On the model tab, the output column name and output dataset name can be chosen; the clustering process will create a new dataset, based on the original dataset, containing the five paint points selected and an additional column denoting the cluster to which the row belongs.

Unlike most of the other charts, the information used to create it cannot be edited once the chart is created. The visualisation, however, can still be edited by right-clicking on the chart, similar to the other charts.

Classification

Classification is the process of categorising new data points on the basis of a training set of data where the classes are already known; this is done by training a model on a dataset. Once the model is trained, it can be used to predict the classes of new data points with the same features.

To create a classification model in Oscar, click on the classification icon under the Charts tab. Then select a categorical variable as the response and several variables as features. Oscar uses multinomial logistic regression for classification. It creates the model by fitting K-1 independent logistic regression models, where K is the number of categories in the response column. A graph is then generated displaying the variable importance of each feature included in the model, along with the precision, recall and accuracy of each one.
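
The evaluation measures mentioned above can be sketched in Python. Precision and recall are computed per class and accuracy over all rows, which is one common reading; exactly how Oscar reports them is not specified here. The function name and the labels are made up for the example.

```python
def classification_report(actual, predicted):
    """Return ({class: {"precision", "recall"}}, overall accuracy)."""
    classes = sorted(set(actual) | set(predicted))
    report = {}
    for cls in classes:
        true_pos = sum(1 for a, p in zip(actual, predicted) if a == cls and p == cls)
        predicted_pos = sum(1 for p in predicted if p == cls)
        actual_pos = sum(1 for a in actual if a == cls)
        report[cls] = {
            # Precision: of the rows predicted as cls, how many really are cls.
            "precision": true_pos / predicted_pos if predicted_pos else 0.0,
            # Recall: of the rows that really are cls, how many were found.
            "recall": true_pos / actual_pos if actual_pos else 0.0,
        }
    accuracy = sum(1 for a, p in zip(actual, predicted) if a == p) / len(actual)
    return report, accuracy

actual = ["cat", "cat", "dog", "dog", "bird", "bird"]     # hypothetical known classes
predicted = ["cat", "dog", "dog", "dog", "bird", "cat"]   # hypothetical model output
report, accuracy = classification_report(actual, predicted)
print(report)
print(accuracy)
```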

Cognition

In Oscar the precognition chart fits a decision tree model to a selected dataset. In order to fit one, a target column must be supplied. Columns can be excluded from the fit by clicking on them in the “Analyse these fields” column so that they move to the “Do not use these fields” column (by default all columns except the target are included in the fit).

Oscar outputs an interactive decision tree graphic. By clicking on the nodes, the distribution of the response variable by the split is shown, along with the percentage of data at that node. By clicking on the bottom nodes, the tree expands along that branch.

The decision tree model can then be used in Data Workshop to predict the outcome for a different dataset with the same schema, by using the Apply Model feature.

Timeplot

The timeplot is a chart that plots a numeric variable against a date variable. The date variable can be aggregated up to different time frames, and the numeric variable can be aggregated using different statistics such as Count.

In order to create a timeplot, click timeplot under charts. Enter a date field for the X axis and a numeric field for the Y axis. You can edit the chart by right clicking or using the spanner icon in edit mode.

Series Slider

The series slider allows you to easily create new filters inside the dashboard. Choose a column or numeric variable, then select a range; the series slider creates a filter that feeds into the cross filtering.