MultiQC plotting functions are held within multiqc.plots submodules.
To use them, import the modules you want, e.g.
from multiqc.plots import bargraph, linegraph.
Once you’ve done that, you will have access to the corresponding plotting
functions, such as bargraph.plot() and linegraph.plot().
These have been designed to work in a similar manner to each other - you
pass a data structure to them, along with optional extras such as categories
and configuration options, and they return a string of HTML to add to the
report. You can add this to the module introduction or sections as described
above, for example by passing the returned value as the plot argument of
self.add_section().
Common options
All plots should, as a minimum, have a config with an id and a title.
MultiQC is written to work with sensible defaults, so won’t complain if you
don’t supply these, but it’s good practice for usability (the ID is used as
a filename when exporting plots, and all plots should have a title when exported).
Plot titles should use the format Module name: Plot name (this is partly for
ease of use within MegaQC and other downstream tools).
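As a minimal sketch of this convention, the config below sets only the two recommended keys. The module name "My Module" and the ID are made up for illustration.

```python
# Minimal plot config: an ID (used as the filename when exporting) and a
# title in the "Module name: Plot name" format. Both values are illustrative.
pconfig = {
    "id": "mymodule_alignment_plot",
    "title": "My Module: Alignment Scores",
}

# Split the title to confirm that it follows the naming convention
module_name, plot_name = pconfig["title"].split(": ", 1)
```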
Bar graphs
Simple data can be plotted in bar graphs. Many MultiQC modules make use
of stacked bar graphs. Here, the bargraph.plot() function comes to
the rescue. A basic example is as follows:
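A minimal sketch of the expected input, assuming the usual layout of sample names as outer keys and category counts as inner keys. The sample and category names are illustrative, and the plot call is left as a comment so that the snippet runs without MultiQC installed.

```python
# Data for a stacked bar graph: sample name -> {category: value}.
# The inner key order may differ between samples.
data = {
    "sample 1": {"aligned": 23542, "not_aligned": 343},
    "sample 2": {"not_aligned": 7328, "aligned": 1275},
}

# With MultiQC installed, the plot would then be built with:
#   from multiqc.plots import bargraph
#   html_content = bargraph.plot(data)
```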
To specify the order of categories in the plot, you can supply a list of
dictionary keys. This can also be used to exclude a key from the plot.
If cats is given as a dict instead of a list, you can specify a nice name
and a colour too:
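The two forms of cats can be sketched as follows. The category names and colours are illustrative, and the name and color keys are assumed to be the ones the bar graph reads; check them against your MultiQC version.

```python
data = {
    "sample 1": {"aligned": 23542, "not_aligned": 343, "duplicates": 120},
    "sample 2": {"aligned": 1275, "not_aligned": 7328, "duplicates": 45},
}

# As a list: fixes the category order and leaves "duplicates" out of the plot
cats_list = ["aligned", "not_aligned"]

# As a dict: same order, plus a display name and a colour per category
cats_dict = {
    "aligned": {"name": "Aligned Reads", "color": "#8bbc21"},
    "not_aligned": {"name": "Unaligned Reads", "color": "#f7a35c"},
}

# Keys present in the data but missing from cats are the excluded ones
excluded = sorted(set(data["sample 1"]) - set(cats_list))

# With MultiQC installed:
#   bargraph.plot(data, cats_list)  or  bargraph.plot(data, cats_dict)
```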
Finally, a third argument should be supplied with configuration options for
the plot. The defaults are as follows:
Note
The keys id and title should always be passed as a minimum.
The id is used for the plot name when exporting.
If left unset the Plot Export panel will call the filename
mqc_hcplot_gtucwirdzx.png (with some other random string).
Plots should always have titles, especially as they can stand by themselves
when exported. The title should have the format Module name: Plot name.
Switching datasets
It’s possible to have a single plot with buttons to switch between different
datasets. To do this, give a list of data objects to the plot function
and specify the data_labels config option with the text to be used for the buttons:
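A minimal sketch, assuming two datasets for the same samples (read counts and base counts). All names and numbers are illustrative.

```python
# Two datasets for the same samples
reads = {
    "sample 1": {"aligned": 23542, "not_aligned": 343},
    "sample 2": {"aligned": 1275, "not_aligned": 7328},
}
bases = {
    "sample 1": {"aligned": 2354200, "not_aligned": 34300},
    "sample 2": {"aligned": 127500, "not_aligned": 732800},
}
datasets = [reads, bases]

# One button label per dataset, in the same order as the data
pconfig = {
    "id": "mymodule_alignment",
    "title": "My Module: Alignment",
    "data_labels": ["Reads", "Bases"],
}

# With MultiQC installed: bargraph.plot(datasets, pconfig=pconfig)
```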
You can also customise any plot configuration per-dataset, for example,
the y-axis label, min/max values, or title:
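A sketch of per-dataset customisation, assuming that each data_labels entry may be a dict with a name plus overrides such as ylab and ymax. The labels and limits here are illustrative.

```python
pconfig = {
    "id": "mymodule_alignment",
    "title": "My Module: Alignment",
    "data_labels": [
        # Button text plus a y-axis label for the first dataset
        {"name": "Reads", "ylab": "Number of reads"},
        # The second dataset also overrides the y-axis maximum
        {"name": "Bases", "ylab": "Number of base pairs", "ymax": 3000000},
    ],
}

# The button text is taken from the "name" of each entry
button_names = [label["name"] for label in pconfig["data_labels"]]
```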
If supplying multiple datasets, you can also supply a list of category
objects. Make sure that they are in the same order as the data.
Categories should contain data keys, so if you’re supplying a list of two datasets,
you should supply a list of two sets of keys for the categories. MultiQC will try to
guess categories from the data keys if categories are missing.
For example, with two datasets supplied as above:
Or with additional customisation such as name and colour:
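Both variants can be sketched as below: a list of key lists, and a list of dicts carrying a name and colour. The key names are illustrative, and the name and color keys are assumed as for single-dataset bar graphs.

```python
# One dictionary holding both read-level and base-level counts
data = {
    "sample 1": {
        "aligned_reads": 23542, "unaligned_reads": 343,
        "aligned_base_pairs": 2354200, "unaligned_base_pairs": 34300,
    },
}

# Plain form: one list of data keys per dataset
cats_simple = [
    ["aligned_reads", "unaligned_reads"],
    ["aligned_base_pairs", "unaligned_base_pairs"],
]

# Customised form: one dict per dataset, with a name and colour per key
cats_custom = [
    {
        "aligned_reads": {"name": "Aligned Reads", "color": "#8bbc21"},
        "unaligned_reads": {"name": "Unaligned Reads", "color": "#f7a35c"},
    },
    {
        "aligned_base_pairs": {"name": "Aligned Base Pairs", "color": "#8bbc21"},
        "unaligned_base_pairs": {"name": "Unaligned Base Pairs", "color": "#f7a35c"},
    },
]

# The same dictionary is supplied twice; the categories pick the keys to plot
datasets = [data, data]

# With MultiQC installed: bargraph.plot(datasets, cats_custom, pconfig)
```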
Note that, as in this example, the plot data can be the same dictionary supplied twice.
Line graphs
This base function works much like the above, but for two-dimensional
data, to produce line graphs. It expects a dictionary with sample identifiers,
each containing numeric x:y points. For example:
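A minimal sketch of that structure, with illustrative sample names and values:

```python
# Line graph data: sample name -> {x value: y value}
data = {
    "sample 1": {1: 0.5, 2: 0.8, 3: 1.2},
    "sample 2": {1: 0.4, 2: 0.9, 3: 1.1},
}

# Every x and y value is numeric
all_numeric = all(
    isinstance(x, (int, float)) and isinstance(y, (int, float))
    for points in data.values()
    for x, y in points.items()
)

# With MultiQC installed:
#   from multiqc.plots import linegraph
#   html_content = linegraph.plot(data)
```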
Additionally, a configuration dict can be supplied. The defaults are as follows:
Note
The keys id and title should always be passed as a minimum.
The id is used for the plot name when exporting.
If left unset the Plot Export panel will call the filename
mqc_hcplot_gtucwirdzx.png (with some other random string).
Plots should always have titles, especially as they can stand by themselves
when exported. The title should have the format Module name: Plot name.
X-axis format
Plotly will try to automatically parse the X-axis values. Strings that look like a
number will be interpreted as numbers (e.g. "13" and "2.0" will turn into 13 and 2.0
and get ordered numerically: 2.0, 13); dates in ISO format will be parsed as datestamps
(e.g. "2021-01-01" will turn into a datetime object and ordered chronologically).
If you want to force the X-axis to be treated as plain strings, set categories=True in the plot config.
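The difference between the two interpretations can be illustrated in plain Python. This sketch only contrasts numeric and string ordering of the same keys; it does not reproduce how Plotly itself lays out the axis.

```python
# X values supplied as strings
data = {"sample 1": {"13": 5, "2.0": 8}}
x_values = list(data["sample 1"])

# Parsed as numbers, the values order as 2.0, 13
numeric_order = sorted(x_values, key=float)

# Compared as plain strings, "13" sorts before "2.0"
string_order = sorted(x_values)

# Config asking for the X-axis to be treated as plain strings
pconfig = {
    "id": "mymodule_lineplot",
    "title": "My Module: Values",
    "categories": True,
}
```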
Switching datasets
You can also have a single plot with buttons to switch between different
datasets. To do this, just supply a list of data dicts instead (same
formats as described above). For example:
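As a sketch, the second dataset below is derived from the first by converting counts to percentages. The helper to_percentages is hypothetical and only there to build the example data.

```python
# First dataset: raw counts per x value
counts = {
    "sample 1": {1: 120, 2: 340, 3: 540},
    "sample 2": {1: 200, 2: 300, 3: 500},
}

def to_percentages(points):
    """Convert the y values of one sample to percentages of their total."""
    total = sum(points.values())
    return {x: 100 * y / total for x, y in points.items()}

# Second dataset: same samples and x values, different y values
percentages = {sample: to_percentages(points) for sample, points in counts.items()}

# A list of data dicts gives one switchable dataset per entry
datasets = [counts, percentages]

# With MultiQC installed: linegraph.plot(datasets, pconfig)
```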
You’ll also want to add the following configuration options to
give names to the buttons and graph labels:
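A sketch of such a config, assuming name, ylab and xlab as the per-dataset keys. The label text is illustrative.

```python
pconfig = {
    "id": "mymodule_coverage",
    "title": "My Module: Coverage",
    "data_labels": [
        # One entry per dataset, in the same order as the data list
        {"name": "Counts", "ylab": "Number of reads", "xlab": "Position (bp)"},
        {"name": "Percentages", "ylab": "% of reads", "xlab": "Position (bp)"},
    ],
}

y_labels = [label["ylab"] for label in pconfig["data_labels"]]
```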
All of these config values are optional; the function will default
to sensible values if things are missing.
Additional data series
Sometimes it is useful to add specific data series manually.
To do this, use config['extra_series']. For a single extra line this can
be a dict (as below). For multiple lines, use a list of dicts. For multiple
dataset plots, use a list of lists of dicts.
For example, to add a dotted x = y reference line:
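The three shapes of extra_series can be sketched as below. The keys inside the series dict (name, pairs, dash, color) are assumptions that may differ between MultiQC versions, so treat them as placeholders and check the defaults for the version you use.

```python
data = {
    "sample 1": {1: 1.2, 5: 4.8, 10: 9.5},
    "sample 2": {1: 0.9, 5: 5.3, 10: 11.0},
}

# Extend the x = y line to the largest value seen on either axis
max_x = max(x for points in data.values() for x in points)
max_y = max(y for points in data.values() for y in points.values())
limit = max(max_x, max_y)

# ASSUMPTION: these key names are illustrative, not confirmed by this text
reference_line = {
    "name": "x = y",
    "pairs": [[0, 0], [limit, limit]],
    "dash": "dot",
    "color": "#000000",
}

# Single plot, one extra line: a dict
single = {"extra_series": reference_line}
# Single plot, several extra lines: a list of dicts
multiple = {"extra_series": [reference_line, dict(reference_line, name="copy")]}
# Two datasets: a list of lists of dicts, one inner list per dataset
per_dataset = {"extra_series": [[reference_line], [reference_line]]}
```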
Box plots
Box plots take a similar data structure to line plots, but better visualize the
underlying data distribution by emphasizing quartiles, mean, median, standard
deviation, the extreme values and the outliers.
Instead of x:y pairs, the box plot takes a flat list of points for each sample:
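A minimal sketch of that structure, with illustrative values. The medians are computed here only to show what each flat list summarises.

```python
import statistics

# Box plot data: sample name -> flat list of numeric points
data = {
    "sample 1": [4.0, 5.5, 5.7, 6.1, 9.8],
    "sample 2": [3.2, 4.4, 4.9, 5.0, 5.3],
}

# One of the summary values a box plot emphasises
medians = {sample: statistics.median(points) for sample, points in data.items()}

# With MultiQC installed:
#   from multiqc.plots import box
#   html_content = box.plot(data, pconfig)
```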
Similarly to other plot types, multiple datasets can be passed as data, along with
dataset-specific configurations provided with the pconfig["data_labels"] option.
Scatter Plots
Scatter plots work in almost exactly the same way as line plots. Most (if not all)
config options are shared between the two. The data structure is similar but not identical:
Note that you must use the keys x and y for each data point.
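A minimal sketch with one point per sample (values are illustrative):

```python
# Scatter data: sample name -> a single point with "x" and "y" keys
data = {
    "sample 1": {"x": 1.2, "y": 3.4},
    "sample 2": {"x": 2.5, "y": 1.1},
}

# Every point uses exactly the keys x and y
keys_ok = all(set(point) == {"x", "y"} for point in data.values())

# With MultiQC installed:
#   from multiqc.plots import scatter
#   html_content = scatter.plot(data, pconfig)
```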
If you want more than one data point per sample, you can supply a list of
dictionaries instead. You can also optionally specify point colours and
sample name suffixes (these are appended to the sample name):
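A sketch of the multi-point form, assuming color and name as the optional per-point keys. The colours and suffixes are illustrative.

```python
# Scatter data: sample name -> list of points
data = {
    "sample 1": [
        {"x": 1, "y": 2, "color": "#a6cee3", "name": "Type 1"},
        {"x": 2, "y": 3, "color": "#1f78b4", "name": "Type 2"},
    ],
    "sample 2": [
        {"x": 3, "y": 1, "color": "#a6cee3", "name": "Type 1"},
        {"x": 4, "y": 5, "color": "#1f78b4", "name": "Type 2"},
    ],
}

# Total number of points that would be drawn
n_points = sum(len(points) for points in data.values())
```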
Remember that MultiQC reports can contain large numbers of samples, so this plot type
is not suitable for large quantities of data - 20,000 genes might look good
for one sample, but when someone runs MultiQC with 500 samples, it will crash
the browser and be impossible to interpret.
See the documentation about line plots for most config options. The scatter plot
has a handful of unique ones in addition:
Creating a table
Tables should work just like the functions above (most like the bar
graph function). As a minimum, the function takes a dictionary containing
data - the first keys will be sample names (row headers) and each key
contained within will be a table column header.
You can also supply a list of key names to restrict the data in the table
to certain keys / columns. This also specifies the order that columns
should be displayed in.
For more customisation, the headers can be supplied as a dictionary. Each
key should match the keys used in the data dictionary, but values can
customise the output.
Finally, the function accepts a config dictionary as a third parameter.
This can set global options for the table (e.g. a title) and can also hold
default values to customise the output of all table columns.
The default header keys are:
A third parameter can be specified with settings for the whole table:
Most of the header keys can also be specified in the table config
(namespace, scale, format, colour, hidden, max, min, ceiling, floor, minrange, shared_key, modify).
These will then be applied to all columns prior to applying column-specific heading config.
A very basic example of creating a table is shown below:
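A minimal sketch, assuming only the data dictionary is passed. The row and column names are illustrative.

```python
# Table data: sample name (row) -> {column key: value}
data = {
    "sample 1": {"aligned": 23542, "not_aligned": 343},
    "sample 2": {"aligned": 1275, "not_aligned": 7328},
}

# Rows and columns implied by the data
rows = list(data)
columns = sorted({key for values in data.values() for key in values})

# With MultiQC installed:
#   from multiqc.plots import table
#   table_html = table.plot(data)
```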
A more complicated version with ordered columns, defaults and column-specific
settings (e.g. no decimal places):
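A sketch of such a table. The header keys title, description and suffix, and all of the values, are assumptions for illustration; namespace, scale, format, min, max and shared_key are among the keys listed above.

```python
data = {
    "sample 1": {"aligned": 23542, "not_aligned": 343},
    "sample 2": {"aligned": 1275, "not_aligned": 7328},
}
# Add a derived percentage column to each row
for values in data.values():
    total = values["aligned"] + values["not_aligned"]
    values["aligned_percent"] = 100 * values["aligned"] / total

# Dicts keep insertion order, so this also sets the column order
headers = {
    "aligned_percent": {
        "title": "% Aligned",
        "description": "Percentage of reads that aligned",
        "suffix": "%",
        "max": 100,
        "format": "{:,.0f}",  # no decimal places
    },
    "aligned": {
        "title": "Aligned",
        "description": "Number of aligned reads",
        "shared_key": "read_count",
    },
}

# Table-wide settings, applied before the column-specific ones
config = {
    "id": "mymodule_alignment_table",
    "title": "My Module: Alignment",
    "namespace": "My Module",
    "min": 0,
    "scale": "GnBu",
}

# How the first percentage would be rendered by its format string
formatted = headers["aligned_percent"]["format"].format(
    data["sample 1"]["aligned_percent"]
)

# With MultiQC installed: table.plot(data, headers, config)
```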
Table decimal places
You can customise how many decimal places a number has by using the format config
key for that column. The default format string is "{:,.1f}", which specifies a
float number with a single decimal place. To remove decimals use "{:,d}".
To have two decimal places, use "{:,.2f}".
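These are standard Python format strings, so their effect can be checked directly. Note that in plain Python the "d" format only accepts integers, which is why the value is converted below; how MultiQC itself handles floats with "{:,d}" is not covered here.

```python
value = 1234.5678

one_decimal = "{:,.1f}".format(value)      # the default format
two_decimals = "{:,.2f}".format(value)     # two decimal places
no_decimals = "{:,d}".format(int(value))   # "d" requires an int in plain Python
```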
Table colour scales
Colour scales are taken from ColorBrewer2.
Colour scales can be reversed by adding the suffix -rev to the name. For example, RdYlGn-rev.
The following scales are available:
Custom cell background colours
You can specify custom background colours for specific values using the bgcols
header config. This takes precedence over scale.
For example, a header config for a column could look like this:
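A sketch of such a header, assuming that bgcols maps cell values to HTML colours. The column key, values and colours are illustrative.

```python
import re

headers = {
    "status": {
        "title": "Status",
        # Cell value -> background colour
        "bgcols": {
            "pass": "#d1e7dd",
            "fail": "#f8d7da",
        },
    }
}

# Check that every colour is a six-digit hex code
hex_colour = re.compile(r"^#[0-9a-fA-F]{6}$")
colours_ok = all(hex_colour.match(c) for c in headers["status"]["bgcols"].values())
```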
Zero centrepoints
If you set the header config bars_zero_centrepoint to True, the background bars
will use the absolute values to calculate bar width. So a value of 0 will have a bar
width of 0, 20 a width of 20 and -30 a width of 30.
This works well with a divergent colour-scheme as the bar width shows the magnitude
of the value properly, whilst the colour scheme shows the difference between positive
and negative values.
For example:
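A sketch of a header using this option, together with the bar widths from the values mentioned above. The column key and the divergent RdBu scale are illustrative choices.

```python
headers = {
    "log2_fold_change": {
        "title": "Log2 FC",
        "scale": "RdBu",  # a divergent colour scheme
        "bars_zero_centrepoint": True,
    }
}

# Bar width follows the absolute value, so sign does not affect the width
values = [0, 20, -30]
bar_widths = [abs(v) for v in values]
```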
Conditional formatting of data values
MultiQC has configuration options to allow users to configure
“Conditional formatting”, with highlighted values in table cells.
Developers can also make use of this functionality within the header config dictionaries
for formatting data values.
The functionality follows the same logic as for user configs with the parameters
cond_formatting_rules and cond_formatting_colours. These correspond to the
user config options table_cond_formatting_rules and table_cond_formatting_colours,
with the exception that cond_formatting_rules needs no column ID, as the rules already belong to a specific column.
For example, a simple header config could look as follows:
A more complex version with multiple rules could be:
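Both variants are sketched below. The comparison names (s_eq, gt, lt), the rule names and the colours are assumptions based on the user config format, so verify them against the conditional formatting documentation.

```python
# Simple: string comparisons mapped onto the pass / warn / fail rules
simple_header = {
    "title": "My table column",
    "cond_formatting_rules": {
        "pass": [{"s_eq": "pass"}, {"s_eq": "true"}],
        "warn": [{"s_eq": "warn"}, {"s_eq": "unknown"}],
        "fail": [{"s_eq": "fail"}, {"s_eq": "false"}],
    },
}

# More complex: custom rule names with numeric comparisons, plus a colour
# for each custom rule (a list of single-key dicts)
complex_header = {
    "title": "My table column",
    "cond_formatting_rules": {
        "high": [{"gt": 80}],
        "low": [{"lt": 20}],
    },
    "cond_formatting_colours": [
        {"high": "#5cb85c"},
        {"low": "#d9534f"},
    ],
}

# Every custom rule name should have a colour defined
coloured = {
    name for entry in complex_header["cond_formatting_colours"] for name in entry
}
```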
Specifying sorting of columns
By default, each table is sorted by sample name alphabetically. You can override the
sorting order using the defaultsort option. Here is an example:
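The sketch below mirrors the sorting described in the following paragraph as a Python dict keyed by table ID. The column and direction key names are assumptions, and where this mapping sits in the config is not shown here.

```python
# Table ID -> sorting options (key names are assumed, not confirmed)
table_sorting = {
    "general_stats_table": {
        "defaultsort": [
            # First sort key, ascending
            {"column": "Mean Insert Length", "direction": "asc"},
            # Second sort key, default (descending) direction
            {"column": "Starting Amount (ng)"},
        ],
    },
    "quast_table": {
        "defaultsort": [
            {"column": "Largest contig"},
        ],
    },
}

general_stats_order = [
    entry["column"] for entry in table_sorting["general_stats_table"]["defaultsort"]
]
```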
In this case, the general stats table will be sorted by “Mean Insert Length” first,
in ascending order, then by “Starting Amount (ng)”, in descending (default) order. The
table with the ID quast_table (which you can find by clicking the “Configure Columns”
button above the table in the report) will be sorted by “Largest contig”.
Violin plots
Violin plots work from the exact same data structure as tables, so the
usage is just the same. Moreover, for every table, a switch button is available
to view a corresponding violin plot for the underlying data.
The function also accepts the same headers and config parameters.
Heatmaps
Heatmaps expect data in the structure of a list of lists. Then, a list
of sample names for the x-axis, and optionally for the y-axis (defaults
to the same as the x-axis).
A simple example:
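A minimal sketch with a three-by-three matrix and a single list of names used for both axes (values are illustrative):

```python
# Names for the x-axis (and, by default, the y-axis)
names = ["one", "two", "three"]

# Heatmap data: a list of rows, each row a list of values
hmdata = [
    [1.0, 0.87, 0.73],
    [0.87, 1.0, 0.6],
    [0.73, 0.6, 1.0],
]

# The matrix must line up with the names on both axes
is_square = len(hmdata) == len(names) and all(len(row) == len(names) for row in hmdata)

# With MultiQC installed:
#   from multiqc.plots import heatmap
#   hm_html = heatmap.plot(hmdata, names)
```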
Alternatively you can supply a dictionary of dictionaries, in which case
xcats and ycats are optional:
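A sketch of the dictionary form. Treating outer keys as rows and inner keys as columns is an assumption here, and the conversion only shows how the two structures correspond.

```python
# Heatmap data as a dictionary of dictionaries
hmdata = {
    "one": {"one": 1.0, "two": 0.87},
    "two": {"one": 0.87, "two": 1.0},
}

# Category names can be read from the keys (assumed: outer = y, inner = x)
ycats = list(hmdata)
xcats = list(next(iter(hmdata.values())))

# Equivalent list-of-lists form
rows = [[hmdata[y][x] for x in xcats] for y in ycats]

# With MultiQC installed: heatmap.plot(hmdata)
```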
Much like the other plots, you can change the way that the heatmap looks
using a config dictionary:
The colour stops are a bit special and can be used to define a custom colour
scheme. These should be defined as a list of lists, with a number between 0 and 1
and a HTML colour. The default is RdYlBu from ColorBrewer:
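A sketch of a custom three-stop scheme using colours from the RdYlBu palette. The config key name colstops is an assumption, and this is not the full default list.

```python
# Colour stops: [position between 0 and 1, HTML colour]
colstops = [
    [0, "#313695"],    # blue at the low end
    [0.5, "#ffffbf"],  # pale yellow in the middle
    [1, "#a50026"],    # red at the high end
]

# Positions should lie within 0..1 and be in ascending order
positions = [stop[0] for stop in colstops]
stops_valid = all(0 <= p <= 1 for p in positions) and positions == sorted(positions)

pconfig = {
    "id": "mymodule_heatmap",
    "title": "My Module: Sample Similarity",
    "colstops": colstops,  # ASSUMPTION: key name for the colour stops
}
```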
Interactive / Flat image plots
Note that all plotting functions except for table can generate both interactive
JavaScript-powered report plots and flat image plots. This choice is made
depending on the presence of the --flat (config.plots_flat) flag.
Note that both plot types should come out looking pretty much identical. If
you spot something that’s missing in the flat image plots, let me know.