So, You Want to Create Amazing Data Visualizations?
Look, we get it - staring at spreadsheets full of numbers is about as exciting as watching paint dry. But what if you could turn those boring numbers into eye-catching charts that actually make sense? When you think of data visualisations, you probably think of Tableau and maybe PowerBI, but Python belongs on that list too.
When to Use Python Libraries for Data Visualisations
While it is always helpful to know how to visualise data in Python, there are some things to consider:
- Learning curve: Python is a scripting language, like R. Tableau, PowerBI, Qlik and the like use a GUI, which makes it easier to plot graphs.
- R vs Python: R, like Python, is a scripting language. R is widely favoured for advanced statistical analysis and visualisation, where it is currently the gold standard, so it’s the better fit for the specialist. Python, however, is far more versatile than R, and for most people it will suffice.
Benefits of Python over Tableau and PowerBI
While Tableau and PowerBI might be simple and more intuitive, being a scripted language has its perks:
- Reproducibility: A Python visualisation is written as code, which works like a recipe. Anyone with a Python interpreter and the same libraries installed can run the script and get the same chart.
- Customisable: Python is extremely flexible, and can create virtually any kind of visualisation. Libraries like Matplotlib and Plotly, among countless others, can be used to design custom plots, interactive dashboards, and even animations.
- Integration with other tools: Python can integrate with various libraries (e.g., for machine learning, web scraping) and handle more advanced data manipulations before visualising.
- Data Handling, Manipulation and Processing: Python can clean, reshape and process data far more flexibly than a GUI data visualisation tool, using libraries like Pandas, NumPy and Dask.
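To make the "recipe" idea concrete, here is a minimal sketch of a script that prepares data and then plots it. The sales data is made up, the seed and the file name `revenue_by_region.png` are assumptions chosen for illustration, and the non-interactive `Agg` backend is used only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic sales records, generated with a fixed seed so every run is identical.
rng = np.random.default_rng(seed=42)
sales = pd.DataFrame({
    "region": rng.choice(["North", "South", "East", "West"], size=200),
    "revenue": rng.normal(loc=1000, scale=250, size=200).round(2),
})

# Data handling step: total revenue per region, largest first.
totals = sales.groupby("region")["revenue"].sum().sort_values(ascending=False)

# Visualisation step: one bar per region.
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(totals.index, totals.values, color="steelblue")
ax.set_title("Total revenue by region (synthetic data)")
ax.set_xlabel("Region")
ax.set_ylabel("Revenue")
fig.tight_layout()
fig.savefig("revenue_by_region.png", dpi=150)
```

Because the data preparation and the chart live in the same script, rerunning it after the data changes regenerates the figure without any manual clicking.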
Key Python Libraries for Data Visualisation
Python has thousands of libraries, but most people only need to know a few for data visualisation:
1. Matplotlib – The “Make it Work” library

Matplotlib is a fundamental library for data visualization in Python. It provides a wide range of tools for creating static, animated, and interactive visualizations.
Key points:
- Matplotlib provides a low-level interface that allows fine-grained control of every aspect of data visualisation.
- Matplotlib allows you to modify every detail of your plots, including axes, ticks, labels, colours, styles, and more.
- Matplotlib is used as a foundation for many Python libraries, including Seaborn. It also provides a rich set of tools and APIs that can be extended and built upon.
- Matplotlib is widely used in academic and scientific communities for creating publication-quality figures.
- Matplotlib integrates well with other scientific computing libraries in Python, such as NumPy and SciPy, making it a convenient choice in many data science and scientific computing contexts.
- Matplotlib is highly customisable, but requires more code for complex visualisations.
- Matplotlib is also the most widely used and documented visualisation library in Python, which makes it suitable for beginners, as there are far more learning resources available.
Limitations:
- Matplotlib only supports basic interactivity in visuals, i.e. zooming and panning.
- Matplotlib operates at a relatively low level, requiring more code to create complex visualizations. Libraries like Seaborn provide higher-level abstractions for creating aesthetically pleasing plots with less effort.
- Matplotlib is not designed for creating web-based, interactive visualizations. Tools like D3.js, Plotly, or Bokeh are better suited for this purpose.
- When working with large datasets, Matplotlib can experience performance degradation. Rendering complex plots with thousands of data points can lead to slow performance and lag, impacting the user experience. For large-scale visualization tasks, other libraries like Datashader might be more suitable.
TLDR
Matplotlib is a Python library which offers highly customisable visualisations. However, the flip-side of this is a steep learning curve. It should still be the first library you learn in Python for visualisation.
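As a rough illustration of that fine-grained control, the sketch below sets axes limits, tick positions, tick labels, line styles and spines by hand. The data (a sine and a cosine curve) and the file name `sine_cosine.png` are assumptions for the example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(x, np.sin(x), color="tab:blue", linestyle="-", linewidth=2, label="sin(x)")
ax.plot(x, np.cos(x), color="tab:orange", linestyle="--", linewidth=2, label="cos(x)")

# Axes: explicit limits, labels and title.
ax.set_xlim(0, 2 * np.pi)
ax.set_ylim(-1.2, 1.2)
ax.set_xlabel("Angle (radians)")
ax.set_ylabel("Value")
ax.set_title("Sine and cosine")

# Ticks: positions and label text chosen by hand.
ax.set_xticks([0, np.pi / 2, np.pi, 3 * np.pi / 2, 2 * np.pi])
ax.set_xticklabels(["0", "π/2", "π", "3π/2", "2π"])

# Styling: dotted grid, no top/right border, frameless legend.
ax.grid(True, linestyle=":", alpha=0.6)
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.legend(loc="upper right", frameon=False)

fig.tight_layout()
fig.savefig("sine_cosine.png", dpi=300)  # high DPI for print-quality output
```

Every one of these calls is optional, which is both the strength and the cost of Matplotlib: nothing is out of reach, but a polished figure takes more lines than in a higher-level library.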
2. Seaborn – The “Make it Pretty” library

Seaborn is a Python data visualisation library built on top of Matplotlib.
Key points:
- Unlike Matplotlib, Seaborn is a high-level plotting library. While Matplotlib requires you to manually define every aspect of a plot, Seaborn does most of the work for you and simplifies the code.
- Seaborn comes with built-in themes and colour palettes that make your plot look polished and professional right out of the box.
- Seaborn also has built-in statistical capabilities, like confidence intervals, fitting regression lines and distribution visualisation, which Matplotlib doesn’t offer natively.
- In addition to these, Seaborn provides specialised plots that take considerably more manual work in Matplotlib, such as annotated heatmaps, pairwise plots, violin plots and swarm plots.
- Seaborn automatically handles many aspects of data visualisation, such as:
- Aggregating data for bar plots
- Handling missing values gracefully
- Automatically scaling axes and adjusting plots for better readability.
- Seaborn provides a variety of built-in colour palettes and tools for creating custom palettes.
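The automatic aggregation and built-in themes mentioned above can be sketched in a few lines. The small `bills` table is made-up data, and the theme, palette and file name are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd
import seaborn as sns

# Apply one of Seaborn's built-in themes and colour palettes.
sns.set_theme(style="whitegrid", palette="muted")

# Made-up restaurant bills, several rows per day.
bills = pd.DataFrame({
    "day": ["Thu", "Thu", "Fri", "Fri", "Sat", "Sat", "Sat", "Sun", "Sun"],
    "bill": [12.5, 18.0, 22.4, 15.1, 30.2, 27.8, 25.0, 19.9, 24.3],
})

# One call: Seaborn averages the bill per day and adds error bars.
ax = sns.barplot(data=bills, x="day", y="bill")
ax.set_title("Average bill by day (synthetic data)")
ax.set_ylabel("Average bill")
ax.figure.savefig("average_bill.png", dpi=150)
```

Note that the raw table is passed in directly. The grouping and averaging that would need an explicit `groupby` before a Matplotlib bar chart happen inside `barplot`.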
Limitations:
- As a consequence of being a high-level visualisation library, Seaborn doesn’t offer the same degree of customisation that Matplotlib does. A workaround for this is to use both libraries together.
- Seaborn lacks some niche plots, like 3D plots, polar plots or network graphs.
- Like Matplotlib, Seaborn can struggle with very large datasets. To combat this, use sampling or preprocessing to reduce the dataset if necessary.
- Seaborn is designed for static plots and adds no interactivity of its own. The most you get is the basic zooming and panning of the underlying Matplotlib window.
- Understanding how to fully leverage Seaborn requires knowledge of Pandas and Matplotlib.
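The workaround of using both libraries together might look like the sketch below: Seaborn draws the statistical plot onto a Matplotlib Axes, and ordinary Matplotlib calls then adjust the details. The study-hours data, the pass mark of 70 and the file name are all invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: exam score rises with hours studied, plus noise.
rng = np.random.default_rng(0)
df = pd.DataFrame({"hours": rng.uniform(0, 10, 50)})
df["score"] = 50 + 4 * df["hours"] + rng.normal(0, 5, 50)

# Seaborn does the statistics: scatter, fitted line and confidence band.
fig, ax = plt.subplots(figsize=(6, 4))
sns.regplot(data=df, x="hours", y="score", ax=ax)

# Matplotlib handles the fine-tuning on the same Axes.
ax.set_title("Score vs hours studied (synthetic data)")
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
ax.axhline(70, color="grey", linestyle="--", linewidth=1)  # hypothetical pass mark
ax.text(0.2, 71, "Pass mark", color="grey")

fig.tight_layout()
fig.savefig("score_vs_hours.png", dpi=150)
```

Passing `ax=ax` is the key step. It tells Seaborn to draw on an Axes you created, so everything Matplotlib offers remains available afterwards.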
TLDR
Seaborn is a high-level visualisation library designed for quick and simple, yet beautiful plots. However, it doesn’t offer the level of customisation that Matplotlib provides, and adds no interactive features of its own.
3. Plotly – The “Make it Move” library

Plotly is the go-to for creating interactive, web-ready visualisations with minimal effort, bridging the gap between static plotting libraries like Matplotlib and fully interactive tools like PowerBI and Tableau.
Key points:
- Interactivity: All plots are interactive by default, allowing users to zoom, pan, hover and select data points.
- Wide range of chart types: Plotly offers a plethora of options, from basic charts like line charts and bar charts to advanced charts like sunburst charts and specialised charts like candlestick charts and geographic maps.
- High-level and low-level API: Plotly simplifies the creation of common visualisations with minimal code through the plotly.express module, but also gives you full control over every aspect of the plot through plotly.graph_objects, allowing for highly customised visualisations.
- Web-ready visualisations: Another USP of Plotly over both Seaborn and Matplotlib is that plots can be embedded in web applications (e.g., Flask, Django) or exported as standalone HTML files. It is also compatible with Jupyter Notebook and Dash (Plotly’s dashboarding framework).
- 3D and Geographic visualisations: You can create 3D scatter plots, surface plots, and volumetric visualizations as well as interactive maps with geographic data, such as choropleth maps and scatter maps.
- Export options: Plotly allows you to export plots as static images (like PNG and PDF) or interactive HTML and JSON.
Limitations:
- Performance issues: Like Matplotlib and Seaborn, Plotly struggles with very large datasets.
- Limited customisation: While Plotly offers plenty of customisation, Matplotlib is the gold standard for static plots used for academic papers and printed materials.
- Dependency on Web Technologies: Since Plotly relies on JavaScript and browser rendering, it may not be ideal for environments where web technologies are restricted or unsupported.
- File size of HTML Exports: Interactive HTML files generated by Plotly can be large, especially for complex plots, which may make sharing or embedding them cumbersome.
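- Limited built-in statistical tools: Plotly’s statistical features are thinner than Seaborn’s. Plotly Express can add a trendline to a scatter plot (for example trendline="ols", which relies on statsmodels), but for most other statistics, such as custom confidence intervals, you need to calculate the values separately and then plot them.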
- Commercial licensing for Enterprise use: The core Plotly library is open source, but some of the more advanced features require a paid subscription.
Other libraries to consider
1. Bokeh
Bokeh is an alternative to Plotly. Both have open-source licences, but Plotly requires a paid subscription for some more advanced features. Here are some key differences between the two:
- Bokeh is considered better for large datasets.
- Bokeh offers a built-in server for real-time updates and Python callbacks, while Plotly requires Dash for server-side functionality.
- Bokeh offers more customisation than Plotly.
- Bokeh also supports real-time streaming data through its server, which Plotly lacks on its own.
- Bokeh can be deployed as standalone HTML with its own widgets, without relying on an external framework.
Limitations:
- Bokeh requires more code than Plotly
- While Bokeh’s visuals are good, they don’t look as polished as Plotly’s out-of-the-box.
- Bokeh has limited 3rd party integrations.
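For comparison with Plotly, here is a minimal sketch of a Bokeh plot written out as a standalone HTML page. The temperature readings, the toolbar selection and the file name `temperature.html` are assumptions made for the example.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Made-up hourly temperature readings.
hours = [0, 3, 6, 9, 12, 15, 18, 21]
temps = [11.0, 9.5, 9.0, 13.5, 18.0, 19.5, 16.0, 12.5]

# The figure is built up step by step: canvas first, then glyphs.
p = figure(
    title="Temperature over a day (synthetic data)",
    x_axis_label="Hour of day",
    y_axis_label="Temperature (°C)",
    tools="pan,wheel_zoom,box_zoom,reset,hover",
)
p.line(hours, temps, legend_label="Temperature", line_width=2)
p.scatter(hours, temps, size=8)

# Render a complete HTML page; BokehJS is loaded from the CDN.
html = file_html(p, resources=CDN, title="Temperature plot")
with open("temperature.html", "w", encoding="utf-8") as f:
    f.write(html)
```

The extra steps compared with a one-line Plotly Express call show the trade-off listed above: more code, but each part of the plot is an object you can configure.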
2. Altair
Altair is a declarative visualisation library based on Vega-Lite. In simple terms, you just describe what you want the chart to look like, and the library figures out how to make it happen.
This makes it the easiest tool to make interactive visualisations – even more than Bokeh and Plotly.
What makes Altair unique?
Altair is designed for the data analyst. It can do things like grouping, filtering, or creating bins (e.g., dividing data into ranges) without you needing to process the data first.
Altair is not suitable for complex interactive dashboards, big data, heavy customisation or real-time updates. It’s designed for data analysts as a quick and easy way to explore data, and it does that job better than any other library.
3. Pygal
Pygal is another visualisation library with a focus on generating scalable vector graphics (SVG).
SVGs are ideal for embedding in web pages as they are resolution-independent. However, Pygal doesn’t handle large datasets very well, and offers limited interactivity.
4. Geoplotlib

Geoplotlib is a Python library designed specifically for geospatial data visualization.
Use geoplotlib when:
- You need static geospatial visualizations.
- You’re working with large geospatial datasets and need fast rendering.
- You’re focused exclusively on geospatial data and don’t need other types of visualizations.
- You prefer a lightweight, standalone library.
Supporting libraries
In most cases, visualisation libraries don’t work on their own. There are a lot of libraries that can be used alongside them, but two are indispensable:
1. Pandas
This is the go-to library for data manipulation, analysis and exploration. In addition to these, Pandas also offers built-in functions for descriptive statistics, time series analysis and group-by operations.
2. NumPy
NumPy (short for Numerical Python) is a fundamental Python library for numerical computing. It provides powerful tools for working with large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently. While Pandas offers some built-in numerical functions, NumPy specialises in them.
For very large datasets, Vaex and PySpark are significantly more suitable than Pandas and NumPy.
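To show how the two libraries divide the work before any plotting happens, here is a small sketch on an invented `orders` table. The column names and values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up order records.
orders = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-01-05", "2024-01-20", "2024-02-03",
        "2024-02-17", "2024-03-09", "2024-03-28",
    ]),
    "category": ["Books", "Games", "Books", "Games", "Books", "Games"],
    "amount": [20.0, 55.0, 35.0, 60.0, 25.0, 80.0],
})

# Pandas: descriptive statistics, group-by and time series resampling.
summary = orders["amount"].describe()
by_category = orders.groupby("category")["amount"].mean()
monthly = orders.set_index("date")["amount"].resample("MS").sum()

# NumPy: fast element-wise maths on the underlying array.
amounts = orders["amount"].to_numpy()
log_amounts = np.log10(amounts)
z_scores = (amounts - amounts.mean()) / amounts.std()

print(by_category)
print(monthly)
```

Results like `by_category` and `monthly` are exactly the shape most plotting functions expect, which is why these two libraries sit underneath nearly every visualisation workflow in Python.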
Conclusion
Now that you’ve learned about the many libraries you can use in Python, you can create visualisations far beyond the scope of Tableau and PowerBI. Here’s a quick recap of how to use each library:
Use Matplotlib to create highly customised static graphs which are ideal for print and academic purposes.
Use Seaborn when you prefer simplicity over customisation and would rather leave the finer details to the (very capable) library.
Use Plotly to create high-quality, interactive and web-ready visualisations with geographic and 3D visualisation capabilities.
Use Bokeh when you want to create interactive visualisations with very large datasets and need advanced customisation.
Use Altair if your goal is simply exploring data.
Use Pygal to create resolution-independent SVGs which are ideal for embedding in web pages for small to medium datasets, but don’t expect much interactivity.
Use Geoplotlib for large datasets to generate static geospatial visualisations.
Now that you know what’s possible, the only limit is your imagination. Happy plotting!